The Army Munitions Command, headed by Major General F. A. Hanson,
hosted the Eleventh Conference on the Design of Experiments in Army
Research, Development and Testing. This three-day meeting, starting
20 October 1965, was conducted at Stevens Institute of Technology in
Hoboken, New Jersey. Colonel Thomas W. McGrath, Deputy Commander
at Headquarters Army Munitions Command, issued the following letter:

"It  is  my  privilege  to  welcome  you  to  the  Eleventh  Conference 
on  the  Design  of  Experiments  in  Army  Research,  Development 
and  Testing.  We  consider  it  a  great  honor  to  be  selected  to  serve 
as  host  to  this  important  meeting. 

We hope that each participant finds this conference both
enjoyable and professionally rewarding."

The Army Mathematics Steering Committee, sponsors of this conference
on behalf of the Office of Chief of Research and Development, would
like  to  thank  Colonel  McGrath  for  his  welcoming  remarks.  Members  of 
this  committee  would  also  like  to  thank  General  Hanson  for  making 
available  personnel  under  his  command  to  help  conduct  this  conference. 

In particular, many thanks are due to Mr. Henry DeCicco, who had the
main  responsibility  as  Chairman  on  Local  Arrangements  for  coordinating 
the  conference  arrangements  at  the  Command  Headquarters. 

The  program  of  this  meeting  included  6  general,  11  technical,  and 
4 clinical sessions. The invited speakers in the general sessions
gave the following addresses:

Confidence Limits for the Reliability of Complex Systems
Dr. Joan R. Rosenblatt, National Bureau of Standards

Non-Linear Models: Estimation and Design
Dr. J. Stuart Hunter, Princeton University

Selecting the Population with the Largest Parameter
Professor Robert E. Bechhofer, Cornell University

Selecting a Subset Containing the Population with the Largest Parameter
Professor Shanti S. Gupta, Purdue University

Target Coverage Problems
Professor William C. Guenther, University of Wyoming

Maximum Likelihood Estimates for the General Mixed Analysis of Variance Model
Professor H. O. Hartley, Texas A&M University

The conference was highlighted by the banquet held on Thursday evening,
21st of October, at Stevens Center with Mrs. Samuel Wilks as guest of
honor. On this occasion Professor John W. Tukey of Princeton University
was presented the first Wilks Memorial Medal Award.

This  volume  of  the  proceedings  contains  26  of  the  papers  which  were 
presented  at  this  meeting.  The  Army  Mathematics  Steering  Committee 
has  asked  that  these  articles  on  modern  principles  on  the  design  of 
experiments,  as  well  as  applications  of  these  ideas,  be  made  available 
in  the  form  of  this  technical  manual. 

The  Eleventh  Conference  was  attended  by  more  than  150  registrants 
and  participants  from  over  57  different  organizations.  Speakers  and 
panelists  came  from  the  National  Bureau  of  Standards,  Princeton  University, 
Rocketdyne  (A  Division  of  North  American  Aviation,  Inc.),  National 
Institute  of  Mental  Health,  Virginia  Polytechnic  Institute,  North  Carolina 
State  University  at  Raleigh,  University  of  Oklahoma,  George  C.  Marshall 
Space  Flight  Center  (NASA),  Cornell  University,  University  of  Georgia, 
University of Tennessee, Purdue University, Texas A&M University,
University of Chicago, University of Wyoming, George Washington University,
and thirteen Army facilities.

The chairman wishes to take this occasion to thank his Advisory
Committee (Henry DeCicco, F. G. Dressel, Walter Foster, Fred Frishman,
Bernard Greenberg, Boyd Harshbarger, William Kruskal, H. L. Lucas,
Clifford Maloney, Henry Mann, and W. Y. Youden) for their assistance in
formulating the program and their help in selecting the invited speakers.
He is also grateful to the authors of contributed papers, chairmen, and
panelists. Without their help this meeting could never have succeeded in
its scientific purposes.


F. E. Grubbs
Conference  Chairman 
ELEVENTH CONFERENCE ON THE DESIGN OF EXPERIMENTS
IN ARMY RESEARCH, DEVELOPMENT AND TESTING


20-22 October 1965


Wednesday,  20  October 


0900-1100  REGISTRATION  -  -  Lobby  of  Stevens  Center 


0930-0945  CALLING  OF  CONFERENCE  TO  ORDER  -  -  4th  Floor  Seminar 
Room  1 

Henry  DeCicco,  Chairman  on  Local  Arrangements 

0945-1200  GENERAL  SESSION  1 

Chairman: Dr. Walter D. Foster, U. S. Army Biological
Laboratories,  Fort  Detrick,  Frederick,  Maryland 

CONFIDENCE  LIMITS  FOR  THE  RELIABILITY  OF  COMPLEX 
SYSTEMS 

Dr.  Joan  R.  Rosenblatt,  National  Bureau  of  Standards 
BREAK 

NON-LINEAR  MODELS:  ESTIMATION  AND  DESIGN 
Dr.  J.  Stuart  Hunter,  Princeton  University 

1200-1330  LUNCH 


Technical  Sessions  I  and  II  and  Clinical  Session  A  will  start  at  1330 
and  run  to  1500.  After  the  break  Technical  Sessions  III  and  IV  and  Clinical 
Session  B  will  convene  at  1530  and  run  to  1700. 


1330-1500 TECHNICAL SESSION I - - 4th Floor Seminar Room

Chairman: Joseph Mandelson, Directorate of Quality
Assurance, U. S. Army Edgewood Arsenal, Edgewood,
Maryland

A PROBLEM OF DETERIORATION IN RELIABILITY
Henry DeCicco, Quality Assurance Directorate, U. S.
Army Munitions Command

GAME  THEORY  TECHNIQUES  FOR  SYSTEM  ANALYSIS  AND 
DESIGN 

Jerome  H.  N.  Selman,  Headquarters,  U.  S.  Army  Munitions 
Command,  Dover,  New  Jersey 

1330-1500  TECHNICAL  SESSION  II  -  -  3rd  Floor  Conference  Room 

Chairman: Badrig Kurkjian, Harry Diamond Laboratories,
Washington, D. C.

SYSTEMATIC METHODS TO CALCULATE FACTOR EFFECTS
AND FITTED VALUES FOR A $2^n 3^m$ FACTORIAL EXPERIMENT
Barry H. Margolin, U. S. Army Electronics Command,
Fort Monmouth, New Jersey

CONSTRUCTION AND COMPARISON OF NON-ORTHOGONAL
INCOMPLETE FACTORIAL DESIGNS
S. R. Webb, Mathematics and Statistics Group, Rocketdyne,
A Division of North American Aviation, Inc., Canoga Park,
California. Representing Aerospace Research Laboratories, Office
of Aerospace Research, U. S. Air Force

1330-1500  CLINICAL  SESSION  A  -  -  4th  Floor  BCD  Room 

Chairman: David Jacobus, Walter Reed Army Institute
of Research, Walter Reed Army Medical Center,
Washington, D. C.

Panelists:

Dr. Walter D. Foster, Biometrics Division, U. S. Army
Biological Warfare Laboratories, Fort Detrick,
Maryland

Dr. Samuel W. Greenhouse, National Institute of Mental
Health, Bethesda, Maryland

Dr. Bernard Harris, Mathematics Research Center,
U. S. Army, University of Wisconsin, Madison, Wis.

Professor  Boyd  Harshbarger,  Virginia  Polytechnic 
Institute,  Blacksburg,  Virginia 

Professor  H.  L.  Lucas,  North  Carolina  State  University 
at  Raleigh,  Raleigh,  North  Carolina 

STATISTICAL  ANALYSIS  OF  AUTOMATICALLY  RECORDED 
PHYSIOGRAPH DATA

John  Atkinson,  Dir/Medical  Research,  CRDL,  Edgewood 

AN APPLICATION OF EXPERIMENTAL DESIGN IN
ERGONOMICS: A CARDIOVASCULAR RESPONSE TO WORK
STRESS
Henry B. Tingey and William H. Kirby, Jr., Terminal
Ballistic Laboratory, Ballistic Research Laboratories,
Aberdeen Proving Ground, Maryland

1500-1530  BREAK 

1530-1700  TECHNICAL  SESSION  III  -  -  4th  Floor  Seminar  Room 

Chairman: O. P. Bruno, Surveillance Branch, Ballistic
Research  Laboratories,  Aberdeen  Proving  Ground,  Md. 

STRATEGY  FOR  THE  OPTIMAL  USE  OF  WEAPONS  BY 
AREA  COVERAGE 

J. A. Nickel, J. D. Palmer and F. J. Kern, Systems
Research  Center,  University  of  Oklahoma,  Norman,  Okla. 
(Representing  the  U.  S.  Army  Edgewood  Arsenal) 

VARIABILITY  OF  LETHAL  AREA 
Bruce  D.  Barnett,  Data  Processing  Systems  Office, 
Picatinny  Arsenal,  Dover,  New  Jersey 

1530-1700 TECHNICAL SESSION IV - - 3rd Floor Conference Room

Chairman: Joseph Weinstein, Mathematics Division,
U. S. Army Electronic R and D Laboratory, Fort
Monmouth, New Jersey

DECISION  PROCEDURE  FOR  MINIMIZING  COSTS  OF 
CALIBRATING LIQUID ROCKET ENGINES
E. L. Bombara, National Aeronautics and Space
Administration,  George  C.  Marshall  Space  Flight 
Center,  Huntsville,  Alabama 

CALCULATION  OF  THE  THEORETICAL  STRENGTH  OF 
TITANIUM  BY  MEANS  OF  THE  COHESIVE  ENERGY 
Perry  R.  Smoot,  U.  S.  Army  Materials  Research 
Agency,  Watertown,  Massachusetts 

1530-1700  CLINICAL  SESSION  B  -  -  4th  Floor  BCD  Room 

Chairman:  Captain  Douglas  Tang,  Walter  Reed  Army 
Institute  of  Research,  Walter  Reed  Army  Medical  Center, 
Washington,  D.C. 

Panelists: 

Professor  Robert  E.  Bechhofer,  Cornell  University, 
Ithaca,  New  York 

Professor A. C. Cohen, Jr., University of Georgia,
Athens,  Georgia 

Professor  Boyd  Harshbarger,  Virginia  Polytechnic 
Institute,  Blacksburg,  Virginia 

Professor  H.  L.  Lucas,  North  Carolina  State  University 
at  Raleigh,  Raleigh,  North  Carolina 

THE  PATHOPHYSIOLOGY  OF  POISONOUS  SNAKE  VENOMS 
J. A. Vick, H. P. Ciuchta, and J. H. Manthei,
U. S. Army Chemical Research and Development
Laboratories, Edgewood Arsenal, Maryland

RELATIONSHIP BETWEEN LESION COUNTS AND SPORE
COUNTS 

Thomas  H.  Barksdale,  William  D.  Brener,  Walter  D. 
Foster,  and  Marian  W.  Jones,  Biological  Laboratories, 
Fort  Detrick,  Frederick,  Maryland 


Thursday,  21  October 

Technical Sessions V, VI, and VII will run from 0830 to 1000. Following
the break, Technical Sessions VIII and IX together with Clinical Session
C will start at 1030 and end at 1200. After lunch Technical Session IX and
X along with Clinical Session D will be held during the time interval
1330-1420. The Panel Discussion is scheduled to be conducted from 1500
to 1700. The banquet starts at 1830.

0830-1000  TECHNICAL  SESSION  V  -  -  4th  Floor  BCD  Room 

Chairman:  Henry  Ellner,  Directorate  for  Quality  Assurance, 
U.  S.  Army  Edgewood  Arsenal,  Edgewood,  Maryland 

EXTREME  VERTICES  DESIGN  OF  MIXTURE  EXPERIMENTS 
R.  A.  McLean,  Purdue  University  and  the  University  of 
Tennessee,  and  V.  L.  Anderson,  Purdue  University. 
Representing Army Research Office-Durham

DESIGN OF A VACUUM-BREAKDOWN EXPERIMENT
M. M. Chrepta, J. Weinstein, G. W. Taylor, and
M. H. Zinn, Electronic Components Laboratory, U. S.
Army Electronics Command, Fort Monmouth, New Jersey

0830-1000  TECHNICAL  SESSION  VI  -  -  3rd  Floor  Seminar  Room 

Chairman:  Albert  Parks,  Harry  Diamond  Laboratories, 
Washington,  D.  C. 

MODEL  SIMULATION  OF  BIO-CELLULAR  SYSTEMS 
George  I.  Lavin,  Terminal  Ballistic  Laboratory,  Ballistic 
Research  Laboratories,  Aberdeen  Proving  Ground,  Md. 

SOME  INFERENTIAL  STATISTICS  WHICH  ARE  RELATIVELY 
COMPATIBLE  WITH  AN  INDIVIDUAL  ORGANISM  METHODOLOGY 
Samuel  H.  Revusky,  U.  S.  Army  Medical  Research  Laboratory, 
Fort  Knox,  Kentucky 

0830-1000 TECHNICAL SESSION VII - - 4th Floor Seminar Room

Chairman: A. Bulfinch, Picatinny Arsenal, Dover, N. J.

CONTROL  OF  DATA-SUPPORT  QUALITY 
Fred S. Hanson, Plans and Operations Directorate,
White Sands Missile Range, New Mexico

DESIGNS AND ANALYSES FOR THE INVERSE RESPONSE
PROBLEM IN SENSITIVITY TESTING
M. J. Alexander and D. Rothman, Mathematics and
Statistics Group, Rocketdyne, A Division of North American
Aviation, Inc., Canoga Park, California. Representing
George C. Marshall Space Flight Center, NASA, Huntsville,
Alabama

1000-1030  BREAK 

1030-1200  TECHNICAL  SESSION  VIII  -  -  4th  Floor  BCD  Room 

Chairman: F. L. Carter, U. S. Army Biological Laboratories,
Fort  Detrick,  Frederick,  Maryland 

MONTE  CARLO  INVESTIGATION  OF  THE  PROBABILITY 
DISTRIBUTIONS OF DIXON'S CRITERIA FOR TESTING
OUTLYING OBSERVATIONS

Walter  L.  Mowchan,  Surveillance  Branch,  Ballistic  Research 
Laboratories,  Aberdeen  Proving  Ground,  Maryland 

TABLES  AND  CURVES  FOR  ESTIMATING  DEGREES  OF 
FREEDOM  FOR  A  TWO  POPULATION  "T"  TEST  WHEN  THE 
STANDARD  DEVIATIONS  ARE  UNKNOWN  AND  UNEQUAL 
E. Dutoit and R. Webster, Quality Assurance Directorate,
Ammunition  Reliability  Division,  Mathematics  and 
Statistics  Branch,  Picatinny  Arsenal,  Dover,  New  Jersey 

1030-1200 TECHNICAL SESSION IX - - 4th Floor Seminar Room

Chairman: Paul C. Cox, Reliability and Statistics Office,
Army  Missile  Test  and  Evaluation  Directorate,  White 
Sands  Missile  Range,  New  Mexico 

DELETING  OBSERVATIONS  FROM  A  LEAST  SQUARES 
SOLUTION 

Charles  A.  Hall,  2nd  Lieutenant,  Technical  Services 
Division,  White  Sands  Missile  Range,  New  Mexico 

PRECISION  AND  BIAS  ESTIMATES  FOR  DATA  FROM 
CINETHEODOLITE  AND  FPS-16  RADARS 
Burton  L.  Williams,  Range  Instrumentation  Systems 
Office,  White  Sands  Missile  Range,  New  Mexico 

1030-1200  CLINICAL  SESSION  C  -  -  3rd  Floor  Seminar  Room 

Chairman: Dr. Fred Hanson, Plans and Operations
Directorate,  White  Sands  Missile  Range,  New  Mexico 

Panelists: 

Professor H. O. Hartley, Texas A and M University,
College  Station,  Texas 

Professor  J.  Stuart  Hunter,  Princeton  University, 
Princeton,  New  Jersey 

Professor  William  Kruskal,  University  of  Chicago, 
Chicago,  Illinois 

Dr. Henry B. Mann, Mathematics Research Center,
U. S. Army, University of Wisconsin, Madison, Wis.


ESTIMATION AND DESIGN FOR NON-LINEAR MODELS

J. Stuart Hunter
Princeton University


The  object  of  this  paper  is  to  survey  current  work  in  estimation  and 
design for non-linear models. The problems of estimation for linear
models are first reviewed, taking recourse to geometric arguments, and
the distinctions between linear and non-linear estimation problems are
described. Techniques for the estimation of parameters in non-linear
models are then discussed: linearization of the model and the Gaussian
Iterant, linearization of the sum of squares function, direct search,
elimination of linear parameters, and linearization of the Normal equations.
Borrowing heavily from the papers of G. E. P. Box and his
co-workers,  the  problems  of  non-linear  design  are  next  discussed, 
both  for  the  number  of  observations  fixed,  and  for  sequential  non-linear 
designs.  The  emergence  of  intrinsic  designs  appropriate  to  individual 
non-linear  models  is  noted. 

Consider a response function expressed in terms of the general
model

$$\eta = f(\xi_1, \xi_2, \ldots, \xi_k;\ \theta_1, \theta_2, \ldots, \theta_p) \tag{1}$$

where $\eta$ is a response, the $\xi_i$, $i = 1, 2, \ldots, k$, are k variables
or factors under the control of the experimenter and the $\theta_j$, $j = 1, 2, \ldots, p$,
are p parameters whose values are unknown.

Two classes of models will be discussed in this paper: linear and
non-linear. Some examples of linear models are:

$$\eta = \theta_0 + \sum_{j=1}^{p-1} \theta_j \xi_j \qquad \text{or} \qquad \eta = \theta_0 + \sum_{j=1}^{p-1} \theta_j\, g_j(\xi),$$

where the $g_j(\xi)$ are functions solely of the $\xi_i$,


as ,  for  example , 


Examples of non-linear models are:

$$\eta = \theta_1\left(1 - e^{-\theta_2 \xi_1}\right), \qquad [\text{second model not legible in the source}], \qquad \eta = \sum_{j=1}^{m} \theta_{1j}\, e^{-\theta_{2j} \xi_1},$$

the growth curve, the Clausius-Clapeyron equation from thermodynamics,
and the sum of exponential decay curves respectively. A clear distinction
between linear and non-linear models will be made shortly.

Consider now $u = 1, 2, \ldots, n$ settings of the controlled variables
and the corresponding responses $\eta_u = f(\xi_{1u}, \xi_{2u}, \ldots, \xi_{ku};\ \theta_1, \theta_2, \ldots, \theta_p)$
or, in matrix notation,

$$\eta_u = f(\xi_u, \theta) \tag{2}$$

where $\xi_u$ = (1 x k) row vector of the u-th setting of the controlled variables
and $\theta$ = (p x 1) column vector of unknown parameters. The total
array of settings of the controlled variables generates an n x k matrix
$\xi$ consisting of the n row vectors $\xi_u$.

Of course, for a given $\xi_u$ we will not observe the true response $\eta_u$ but
rather record an observation $y_u$ where $y_u = \eta_u + \epsilon_u$, or,

$$y = \eta + \epsilon \tag{3}$$

where $y$ = n x 1 vector of observations,
$\eta$ = n x 1 vector of responses,
$\epsilon$ = n x 1 vector of disturbances.

In all that follows the individual disturbances $\epsilon_u$ are considered to be
random events, Normally distributed with zero mean and homogeneous
variance $\sigma^2$, that is, $E(\epsilon) = 0$; $E(\epsilon\epsilon^T) = I_n \sigma^2$. Thus the joint
probability density function for the observations is:

$$p(y) = \left(\frac{1}{\sigma\sqrt{2\pi}}\right)^{n} \exp\left\{-\sum_{u=1}^{n} (y_u - \eta_u)^2 / 2\sigma^2\right\}$$
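Once the model $\eta = f(\xi, \theta)$ is given we obtain:

$$p(y \mid \xi, \theta, \sigma^2) = \left(\frac{1}{\sigma\sqrt{2\pi}}\right)^{n} \exp\left\{-\sum_{u=1}^{n} [y_u - f(\xi_u, \theta)]^2 / 2\sigma^2\right\} \tag{4}$$

Since we will know $y$, $\xi$ and $\sigma^2$, the likelihood function for the
parameters $\theta$ in the model is given by

$$L(\theta \mid y, \xi, \sigma^2) = \left(\frac{1}{\sigma\sqrt{2\pi}}\right)^{n} \exp\left\{-\sum_{u=1}^{n} [y_u - f(\xi_u, \theta)]^2 / 2\sigma^2\right\} \tag{5}$$

Our objective now is to find those values $\hat{\theta}$ of the parameters which
maximize the likelihood function, or, equivalently, the logarithm of
the likelihood function

$$\ell = \ln L = -\frac{n}{2}\ln(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{u=1}^{n} [y_u - f(\xi_u, \theta)]^2 . \tag{6}$$

Thus, the maximum likelihood estimates $\hat{\theta}$ are obtained when the
sum of squares function

$$S(\theta) = \sum_{u} [y_u - f(\xi_u, \theta)]^2 \tag{7}$$

is minimized, i.e., when the least squares estimates $\hat{\theta}$ are obtained,
thus

$$S(\hat{\theta}) = \sum_{u} (y_u - \hat{y}_u)^2 \tag{8}$$

where $\hat{y}_u = f(\xi_u, \hat{\theta})$ are the predicted values [1].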

It  will  be  helpful  now  to  discuss  least  squares  geometrically  [2]  . 
In  this  discussion,  in  order  to  "see"  what  is  happening,  we  will 
restrict  ourselves  to  problems  in  which  the  number  of  observations 
n = 3 and the number of parameters p = 2. For n > 3 and p > 2,
(n > p), the reader is asked to use his imagination and remember that
the rules of geometry employed will apply whatever the number of
dimensions. Suppose an experimenter wishes to fit the linear model
$y_u = \theta_0 \xi_{0u} + \theta_1 \xi_{1u} + \epsilon_u$ and that for each of three settings of
$\xi_0$ and $\xi_1$ he records a single observation $y_u$ as given in Table 1
and displayed in Figure 1.


TABLE 1

$\xi_0$    $\xi_1$    $y$      $\hat{y}$ for $\theta_0 = 10$, $\theta_1 = 4$    $y - \hat{y}$
1      2      18.4     18.0                           0.4
1      1      14.2     14.0                           0.2
1      4      24.8     26.0                          -1.2

The elements of the observation vector $y$ provide the coordinates
of a point in the n = 3 dimensional "observation space" as illustrated
in Figure 1a. The line segment joining this point to the origin is
called the observation vector. Since there are p = 2 unknown parameters
in the postulated model we can imagine a second coordinate
system called the "parameter space" as illustrated in Figure 1b.
Suppose now the experimenter chooses for his initial values of the
parameters $\theta_0 = 10$ and $\theta_1 = 4$, thus locating the point $\theta^{(0)}$ in the
parameter space. Associated with $\theta^{(0)}$ will be the point $\hat{y}^{(0)}$ in the
observation space determined by the prediction equation $\hat{y}_u = 10\xi_{0u} + 4\xi_{1u}$ as
illustrated in Figure 1c. (The coordinates for $\hat{y}^{(0)}$ are also given in
Table 1.) In fact, for every point $\theta$ in parameter space an associated
point $\hat{y}$ can be located in the observation space. Remarkably, the
surface generated by the predicted values $\hat{y}$ will be flat. In this
simple example they form a p = 2 dimensional plane as illustrated
in Figure 1c. The distance squared from the point $y$ to the point
$\hat{y}$ is given by

$$S(\theta) = \sum_{u} (y_u - \hat{y}_u)^2 = \sum_{u} [y_u - f(\xi_u, \theta)]^2 .$$

U  U 


FIGURE 1 (panels 1a to 1d): observation space and parameter space
From Table 1 we see that $S(\theta^{(0)}) = 1.64$. This sum of squares is recorded
in Figure 1d at the point $\theta^{(0)}$.

Our objective now is to locate the point $\hat{y}$ on the prediction surface
closest to the observation point $y$, or equivalently, to find the point
$\hat{\theta}$ in the parameter space such that $S(\hat{\theta}) = \sum_u (y_u - \hat{y}_u)^2$, the length squared
of the vector $(y - \hat{y})$, is smallest. (The symbol $\hat{\theta}$ indicates the least
squares point and $\theta$ any other point in the parameter space. Similarly,
the predicted values computed from $\hat{\theta}$ and from $\theta$ are the associated points
on the prediction sub-surface in the
n-space of the observations.) Differentiating $S(\theta)$ with respect to each of
the p parameters $\theta_j$ and setting these expressions equal to zero gives
the p "normal" equations:

$$\frac{\partial S(\theta)}{\partial \theta_j} = -2 \sum_{u} [y_u - f(\xi_u, \theta)] \left[\frac{\partial f(\xi_u, \theta)}{\partial \theta_j}\right] = 0 \tag{9}$$

or in matrix notation:

$$X^T (y - \hat{y}) = 0 \tag{10}$$

where $X$ = n x p matrix of derivatives whose elements are $\partial f(\xi_u, \theta) / \partial \theta_j$,
$y$ = n x 1 vector of observations,
$\hat{y}$ = n x 1 vector of predicted values.

The "normal" equations guarantee that the vector $y - \hat{y}$ will be
perpendicular (normal) to the prediction surface and hence that the
length squared of this vector, $S(\hat{\theta})$, is a minimum. Now when the model
$\eta_u = f(\xi_u, \theta)$ is linear, the response vector $\eta$ may be written as
$\eta = \xi\theta$. Further, Eq. (7) may now be written $S(\theta) = (y - \xi\theta)^T (y - \xi\theta)$.
When we construct the normal equations, the elements of the u-th row
of the matrix of derivatives $X$ are simply

$$\frac{\partial f(\xi_u, \theta)}{\partial \theta} = \xi_u .$$

The parameters $\theta$ disappear upon differentiation and we have simply that the
matrix of derivatives $X = \xi$. Since $\hat{y} = \xi\hat{\theta}$, equation (10) becomes

$$\xi^T (y - \xi\hat{\theta}) = 0 \quad \text{or} \quad (\xi^T \xi)\,\hat{\theta} = \xi^T y . \tag{11}$$

Solving for $\hat{\theta}$ gives

$$\hat{\theta} = (\xi^T \xi)^{-1} \xi^T y , \tag{12}$$

the familiar least squares solution for the coefficients in a linear model.

The analysis of variance table now becomes nothing more than the
resolution of the observation vector $y$ into orthogonal components,
the degrees of freedom column merely keeping track of the number of
dimensions in which the corresponding vectors are free to move. Thus
we have in general (n observations, p parameters):

(13) Analysis of Variance Table

                                               Sum of Squares                                   Degrees of Freedom

Total Sum of Squares                           $y^T y$                                          $n$
(Length Squared of Observation Vector)

Regression Sum of Squares                      $\hat{y}^T \hat{y} = \hat{\theta}^T \xi^T \xi \hat{\theta}$      $p$
(Length Squared of Vector of Predicted Values)

Residual Sum of Squares                        $S(\hat{\theta}) = (y - \hat{y})^T (y - \hat{y})$        $\nu = n - p$
(Length Squared of Vector of Residuals)
FIGURE 2 (showing the vector of residuals)

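In our example we have (remembering that for this linear model $\xi = X$):

$$\xi = X = \begin{bmatrix} 1 & 2 \\ 1 & 1 \\ 1 & 4 \end{bmatrix}, \qquad y = \begin{bmatrix} 18.4 \\ 14.2 \\ 24.8 \end{bmatrix}$$

$$(X^T X)\,\hat{\theta} = X^T y : \qquad \begin{bmatrix} 3 & 7 \\ 7 & 21 \end{bmatrix} \hat{\theta} = \begin{bmatrix} 57.4 \\ 150.2 \end{bmatrix}$$

$$\hat{\theta} = (X^T X)^{-1} X^T y = \frac{1}{14} \begin{bmatrix} 21 & -7 \\ -7 & 3 \end{bmatrix} \begin{bmatrix} 57.4 \\ 150.2 \end{bmatrix} = \begin{bmatrix} 11.0000 \\ 3.4857 \end{bmatrix}$$

$$\hat{y} = X\hat{\theta} = \begin{bmatrix} 1 & 2 \\ 1 & 1 \\ 1 & 4 \end{bmatrix} \begin{bmatrix} 11.0000 \\ 3.4857 \end{bmatrix} = \begin{bmatrix} 17.9714 \\ 14.4857 \\ 24.9428 \end{bmatrix}, \qquad y - \hat{y} = \begin{bmatrix} 0.4286 \\ -0.2857 \\ -0.1428 \end{bmatrix}$$

                                      Sum of Squares    Degrees of Freedom
Total SSq.       $y^T y$              1155.2400         3
Regression SSq.  $\hat{y}^T \hat{y}$  1154.9500         2
Residual SSq.                            0.2900         1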
The residual sum of squares $S(\hat{\theta}) = 0.2900$ in the Analysis of Variance
Table is obtained by subtraction. Using our vector of estimated residuals,
$S(\hat{\theta}) = 0.2857$. The failure of these two values of $S(\hat{\theta})$ to agree
exactly is due only to rounding errors.
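The arithmetic of this worked example can be checked with a short script. What follows is only a minimal sketch in plain Python, using the three observations of Table 1; the helper names `normal_equations` and `solve_2x2` are illustrative and do not come from the paper.

```python
# Least squares for the Table 1 data via the normal equations (X^T X) theta = X^T y.
rows = [(1.0, 2.0), (1.0, 1.0), (1.0, 4.0)]   # rows of X: (xi_0, xi_1)
y = [18.4, 14.2, 24.8]                         # observations


def normal_equations(design, obs):
    """Return (X^T X, X^T y) for a two-column design matrix."""
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(2)] for i in range(2)]
    xty = [sum(r[i] * v for r, v in zip(design, obs)) for i in range(2)]
    return xtx, xty


def solve_2x2(a, b):
    """Solve the 2 x 2 linear system a * t = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    t0 = (b[0] * a[1][1] - a[0][1] * b[1]) / det
    t1 = (a[0][0] * b[1] - a[1][0] * b[0]) / det
    return [t0, t1]


xtx, xty = normal_equations(rows, y)
theta_hat = solve_2x2(xtx, xty)

# Predicted values, residuals, and the three sums of squares of the ANOVA table.
y_hat = [r[0] * theta_hat[0] + r[1] * theta_hat[1] for r in rows]
resid = [v - f for v, f in zip(y, y_hat)]
total_ss = sum(v * v for v in y)
regression_ss = sum(f * f for f in y_hat)
residual_ss = sum(e * e for e in resid)

# Geometric check: the residual vector is perpendicular to every column of X.
orth = [sum(r[i] * e for r, e in zip(rows, resid)) for i in range(2)]

print("theta_hat     =", [round(t, 4) for t in theta_hat])
print("total SSq     =", round(total_ss, 4))
print("regression SSq=", round(regression_ss, 4))
print("residual SSq  =", round(residual_ss, 4))
print("X^T (y - y_hat) =", [round(v, 10) for v in orth])
```

Carried at full precision, the residual sum of squares computed from the residuals and the one obtained by subtraction coincide, which is consistent with the remark that the printed discrepancy is a rounding effect.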

Granting the model is correct, and that the observations
$y_u \sim N(\eta_u, \sigma^2)$, then $S(\hat{\theta})/\nu = s^2$ estimates $\sigma^2$ with $\nu = (n - p)$ degrees
of freedom. Further $E(\hat{\theta}) = \theta$; $V(\hat{\theta}) = (X^T X)^{-1} \sigma^2$ and in fact the $\hat{\theta}$
are distributed in a multivariate Normal: $N(\theta;\ [X^T X]^{-1} \sigma^2)$. Let $\theta$ be
specific values for the parameters postulated by the experimenter. To
determine whether the least squares estimates $\hat{\theta}$ are reasonable in
the light of this hypothesis we may now perform the F test

$$F_{p,\nu} = \frac{(\hat{\theta} - \theta)^T X^T X (\hat{\theta} - \theta) / p}{S(\hat{\theta}) / \nu} . \tag{14}$$

If this observed value of $F_{p,\nu}$ is such that $\mathrm{Prob}\{F \ge F_{p,\nu}\} \le \alpha$,
that is, if $F_{p,\nu} \ge F_{p,\nu,\alpha}$,
we reject the hypothesis that the parameters could in fact equal $\theta$.
A geometric view of this testing procedure is given in Figure 3. Here
we see the observation point $y$, the point on the solution locus $\hat{y}$ which
is closest to $y$ and, finally, the point $\eta$ determined from the model
$\eta = \xi\theta$.


FIGURE 3

The resolution of the observation vector $y$ having its
origin at the point $\eta = \xi\theta$
Accepting the hypothesis that $\eta = \xi\theta$ is correct, then the vector $y - \eta$
is due to random variability alone. The length squared of this vector,
$(y - \eta)^T (y - \eta)$, is then distributed as a $\sigma^2\chi^2$ with n degrees of freedom.
Since $y - \hat{y}$ is normal to the solution locus which contains $\hat{y} - \eta$ we have,
thanks to Pythagoras:

$$(y - \eta)^T (y - \eta) = (y - \hat{y})^T (y - \hat{y}) + (\hat{y} - \eta)^T (\hat{y} - \eta)$$

or

$$S(\theta) = S(\hat{\theta}) + (\theta - \hat{\theta})^T \xi^T \xi\, (\theta - \hat{\theta}) \tag{15}$$

or

$$S(\theta) = S(\hat{\theta}) + S(\theta - \hat{\theta})$$

and since the errors are independent $S(\hat{\theta})$ is distributed as a $\sigma^2\chi^2$ with
$\nu = n - p$ degrees of freedom and $S(\theta - \hat{\theta})$ is distributed as a $\sigma^2\chi^2$ with p
degrees of freedom. Thus the ratio given in Eq. (14) is distributed as
$F_{p,\,n-p}$. We also observe that, with the exception of the constants p
and n - p, the F ratio is in fact equivalent to $\cot^2 \psi$,
where $\psi$ is the angle between $y - \eta$ and $\hat{y} - \eta$. When the angle $\psi$ is
small (and hence $\eta$ far from $\hat{y}$, or equivalently $\theta$ far from $\hat{\theta}$), F
will be large.


The boundary of the $100(1-\alpha)\%$ confidence region of $\theta$ is obtained by
substituting in Eq. (14) the $F_{p,\nu,\alpha}$ critical value and solving the
resulting expression for $\theta$, thus

$$(\theta - \hat{\theta})^T \xi^T \xi\, (\theta - \hat{\theta}) = p\, s^2 F_{p,\nu,\alpha} , \tag{16}$$

a quadratic form in the $\theta$; (1), (3). An illustration of this boundary
(for p=2) is displayed in Figure 4 by the dashed ellipse.
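As a small numerical illustration of Eqs. (14) and (16), the sketch below evaluates the F ratio for the postulated values $\theta_0 = 10$, $\theta_1 = 4$ of Table 1 against the least squares estimates found earlier. The function names are illustrative only, and the critical value 199.5 is the upper 5% point of F with 2 and 1 degrees of freedom taken from standard tables, not from the paper.

```python
# F ratio of Eq. (14) and the confidence-region test of Eq. (16) for the Table 1 example.
XTX = [[3.0, 7.0], [7.0, 21.0]]          # X^T X for the Table 1 design
THETA_HAT = [11.0, 48.8 / 14.0]          # least squares estimates (11.0000, 3.4857)
RESIDUAL_SS = 2.0 / 7.0                  # S(theta_hat) = 0.2857
N_OBS, N_PAR = 3, 2


def quadratic_form(d, a):
    """Return d^T a d for a 2-vector d and a 2 x 2 matrix a."""
    return sum(d[i] * a[i][j] * d[j] for i in range(2) for j in range(2))


def f_ratio(theta):
    """Observed F of Eq. (14) for hypothesized parameter values theta."""
    d = [h - t for h, t in zip(THETA_HAT, theta)]
    s2 = RESIDUAL_SS / (N_OBS - N_PAR)   # s^2 with nu = n - p degrees of freedom
    return (quadratic_form(d, XTX) / N_PAR) / s2


def inside_region(theta, f_crit):
    """True when theta lies inside the region bounded by Eq. (16)."""
    return f_ratio(theta) <= f_crit


f_obs = f_ratio([10.0, 4.0])
print("observed F =", round(f_obs, 4))
print("inside 95% region:", inside_region([10.0, 4.0], 199.5))
```

With only one residual degree of freedom the region is very wide, so the initial guess (10, 4) is not rejected even though its sum of squares, 1.64, is several times the minimum.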
The points $\hat{y}$ produced for various values of the $\theta$ will now produce a curved
prediction subspace as illustrated in Figure 5a. In the parameter space,
the contours of the sums of squares $S(\theta)$ will produce elongated and
twisted elliptical shapes as illustrated in Figure 5b. However, the
maximum likelihood estimates of the parameters still require that we
locate the point in the prediction subspace closest to $y$, or equivalently,
find the point $\hat{\theta}$ in the parameter space where $S(\theta)$ is smallest.

FIGURE 5

Geometric Interpretation of Non-Linear Least Squares (Figure 5a, Figure 5b)

Thus, we form the normal equations $X^T [y - \hat{y}] = 0$ except that this time
the derivative matrix $X$ consists of elements $x_{uj}$, $u = 1, \ldots, n$; $j = 1, \ldots, p$,
containing the $\theta$'s. In general, we have

$$x_{uj} = \frac{\partial f(\xi_u, \theta)}{\partial \theta_j} \tag{17}$$

or, for our example, since $\eta_u = \theta_1 e^{\theta_2 \xi_u}$,

$$X = \begin{bmatrix} e^{\theta_2 \xi_1} & \theta_1 \xi_1 e^{\theta_2 \xi_1} \\ e^{\theta_2 \xi_2} & \theta_1 \xi_2 e^{\theta_2 \xi_2} \\ e^{\theta_2 \xi_3} & \theta_1 \xi_3 e^{\theta_2 \xi_3} \end{bmatrix} \tag{18}$$
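and, for the settings $\xi_u = 2, 1, 4$,

$$X^T [y - \hat{y}] = \begin{bmatrix} e^{2\theta_2} & e^{\theta_2} & e^{4\theta_2} \\ 2\theta_1 e^{2\theta_2} & \theta_1 e^{\theta_2} & 4\theta_1 e^{4\theta_2} \end{bmatrix} \begin{bmatrix} y_1 - \theta_1 e^{2\theta_2} \\ y_2 - \theta_1 e^{\theta_2} \\ y_3 - \theta_1 e^{4\theta_2} \end{bmatrix} .$$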
To find those values $\theta_1$ and $\theta_2$ that will satisfy the conditions of
the normal equation $X^T (y - \hat{y}) = 0$ is, usually, a very difficult task.
We now discuss some of the various methods proposed for locating the
point $\hat{\theta}$.

Linearize the Model:

Since the model is non-linear, we convert it to a linear model
(approximately) by expanding the model in a first-order Taylor series
about some set of initial guessed values of the parameters $\theta^{(0)}$. Thus

$$y_u \approx f(\xi_u, \theta^{(0)}) + \sum_{j=1}^{p} (\theta_j - \theta_j^{(0)}) \left[\frac{\partial f(\xi_u, \theta)}{\partial \theta_j}\right]_{\theta = \theta^{(0)}} \tag{19}$$

or

$$y_u - \hat{y}_u^{(0)} = \sum_{j=1}^{p} (\theta_j - \theta_j^{(0)})\, x_{uj} , \tag{20}$$

a set of n linear equations in the p unknowns $(\theta_j - \theta_j^{(0)})$, where
$\hat{y}_u^{(0)}$ is the predicted value of the response for the initial guessed
values and the $x_{uj}$ are the derivatives evaluated at $\theta^{(0)}$. In matrix
notation we have

$$\delta y = X (\delta\theta) \tag{21}$$

where
$\delta y$ = (n x 1) vector of deviations $(y - \hat{y}^{(0)})$,
$X$ = (n x p) matrix of derivatives $x_{uj}$,
$\delta\theta$ = (p x 1) vector of corrections $(\theta_j - \theta_j^{(0)})$.

Since our model is now a linear one we can solve for $\delta\theta$ giving

$$\delta\hat{\theta} = (X^T X)^{-1} X^T (\delta y) .$$

Once we have the estimated corrections $\delta\hat{\theta}$ we begin anew with new
values of the parameters $\theta^{(1)} = \theta^{(0)} + \delta\hat{\theta}$ and continue the iteration
until the estimated corrections $\delta\hat{\theta}$ are not different from zero. In
actual practice the full correction $\delta\hat{\theta}$ is usually not employed but
rather corrections proportional to $\delta\hat{\theta}$, that is $\nu\,\delta\hat{\theta}$ where
$0 < \nu < 1$; (4), (5). This method of locating $\hat{\theta}$ is often called the
Gauss-Newton or simply the Gaussian Iterant.


For example, suppose we are given the model $\eta_u = \theta_1 + e^{-\theta_2 \xi_u}$
and that we record three observations $y_u = \eta_u + \epsilon_u$ where the $\epsilon_u$
are Normal and independently distributed $N(0, \sigma^2)$. The vector $\xi$
giving the levels of the controlled variable, and the associated response
vector $y$ are given in the following table. Let the initial estimates
$\theta^{(0)}$ be $\theta_1^{(0)} = 10$ and $\theta_2^{(0)} = 1.1$. The vector of predicted
values $\hat{y}^{(0)}$ and deviations $\delta y$ are also given in the table.

TABLE 2

$\hat{y}^{(0)} = 10 + e^{-1.1\xi}$      $\delta y = y - \hat{y}^{(0)}$
13.35                     -1.35
12.16                     -1.16
16.49                     19.51

$S(\theta^{(0)}) = \sum (\delta y)^2 = 383.885$
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2  u 

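The derivatives of the model $\theta_1 + e^{-\theta_2 \xi_u}$ with respect to the parameters
are

$x_{u1} = \partial \eta_u / \partial \theta_1 = 1$ ,    $x_{u2} = \partial \eta_u / \partial \theta_2 = -\xi_u e^{-\theta_2^{(0)} \xi_u}$ ,

giving for the matrix of derivatives:

$X = \begin{bmatrix} 1 & 3.6889 \\ 1 & 1.5119 \\ 1 & 11.0301 \end{bmatrix}$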


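Solving now for the corrections $\delta\hat{\theta} = (X^T X)^{-1} X^T (\delta y)$ gives

$\delta\hat{\theta} = \begin{bmatrix} -7.0084 \\ 2.3427 \end{bmatrix}$

and hence a new set of values for $\theta$, that is

$\theta^{(1)} = \theta^{(0)} + \delta\hat{\theta} = \begin{bmatrix} 10 - 7.0084 \\ 1.1 + 2.3427 \end{bmatrix} = \begin{bmatrix} 2.9916 \\ 3.4427 \end{bmatrix}$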


These values $\theta^{(1)}$ are now employed in another iteration, and the
process is repeated until (hopefully) the estimated corrections $\delta\hat{\theta}$
vanish. In this example the fifth, sixth and seventh iterations gave

$\theta^{(5)} = \begin{bmatrix} 4.971 \\ 2.030 \end{bmatrix}$ ,  $\theta^{(6)} = \begin{bmatrix} 5.033 \\ 2.014 \end{bmatrix}$ ,  $\theta^{(7)} = \begin{bmatrix} 5.035 \\ 2.013 \end{bmatrix}$

$S(\theta^{(5)}) = 9.1117$ ,  $S(\theta^{(6)}) = 8.4131$ ,  $S(\theta^{(7)}) = 8.4128$

The fitted model was taken to be

$\hat{y} = 5.035 + e^{-2.013 \xi}$

These calculations are taken from introductory material appearing in a
Master's Thesis by Norman Dahl, Princeton University, 1963 (6).
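
The iteration above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the computation used in the thesis: the levels `XI` and responses `Y` are not legible in Table 2 and are back-calculated here from the printed predicted values, deviations and derivatives, and the full correction (v = 1) is applied at every step.

```python
import math

# Assumed data: back-calculated from the printed predictions, deviations and
# derivative matrix, because the xi and y columns of Table 2 are not legible.
XI = [-1.1, -0.7, -1.7]
Y = [12.0, 11.0, 36.0]


def model(theta, xi):
    """eta = theta1 + exp(-theta2 * xi)."""
    return theta[0] + math.exp(-theta[1] * xi)


def sum_of_squares(theta):
    """S(theta), the sum of squared deviations."""
    return sum((y - model(theta, xi)) ** 2 for xi, y in zip(XI, Y))


def gauss_newton_step(theta):
    """Return the correction (X'X)^-1 X'(dy) for the two parameters."""
    dy = [y - model(theta, xi) for xi, y in zip(XI, Y)]
    x1 = [1.0 for _ in XI]                               # d eta / d theta1
    x2 = [-xi * math.exp(-theta[1] * xi) for xi in XI]   # d eta / d theta2
    a11 = sum(a * a for a in x1)
    a12 = sum(a * b for a, b in zip(x1, x2))
    a22 = sum(b * b for b in x2)
    g1 = sum(a * d for a, d in zip(x1, dy))
    g2 = sum(b * d for b, d in zip(x2, dy))
    det = a11 * a22 - a12 * a12
    # Solve the 2 x 2 normal equations by Cramer's rule.
    return ((a22 * g1 - a12 * g2) / det, (a11 * g2 - a12 * g1) / det)


def gauss_newton(theta, iterations=20, v=1.0):
    """Apply v times the correction repeatedly, starting from theta."""
    for _ in range(iterations):
        d1, d2 = gauss_newton_step(theta)
        theta = (theta[0] + v * d1, theta[1] + v * d2)
    return theta


first = gauss_newton_step((10.0, 1.1))
theta_hat = gauss_newton((10.0, 1.1))
print(first, theta_hat, sum_of_squares(theta_hat))
```

With these assumed data the first correction and the converged estimates should agree closely with the printed values; setting `v` below 1 gives the damped corrections mentioned earlier.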

Linearize the Sum of Squares Function


To locate the values of the $\theta$'s which reduce the sum of squares
function $S(\theta)$ to a minimum, we may use standard response surface
techniques (7), (8). Here the sum of squares function is approximated
locally by a polynomial linear in the parameters. The response is the
sum of squares $S(\theta)$ for each chosen set of the p parameters $\theta$, as
illustrated for p = 2 parameters in Figure 6.

FIGURE 6

Locating $S(\hat{\theta})$ by Response Surface Methods
[Figure: contours of the sum of squares $S(\theta)$, with the path of steepest descent]

Suppose the experiment began with the guessed values of $\theta$ illustrated
by the simplex design in the lower left hand portion of Figure 6. Upon
computing $S(\theta)$ at each of these settings the path of steepest descent
can then be determined as indicated by the arrow in the figure. Trials
along this path lead to the bottom of the trough. In practice, the size
of the steps along the gradient can seriously affect the speed of
convergence of the iteration, and several proposals have been made for
adjusting the size of the steps to be taken [9], [10]. It is occasionally
possible, as illustrated in Figure 6, to employ a second order design,
and approximating polynomial in the $\theta$'s, and empirically determine
the curved nature of the $S(\theta)$ contours. This additional information is
useful in determining the direction of the trough.

For the case where p = 2 or 3 it is often possible to determine
$S(\theta)$ everywhere on a grid of values of $\theta$, thus permitting the contours
of $S(\theta)$ to be sketched in by hand. The position of $S(\hat{\theta})$ can then be
determined directly. This brute force method is admissible only for
p small, and where computation is both very fast and economical.

Direct Search


Direct Search [11] is a method for determining $S(\hat{\theta})$ which does not
employ any one strategy unless there is a demonstrable reason for
doing so. One direct search routine, called 'pattern search', has proved
useful. Initially a 'good' point $\theta$ is chosen in the parameter space and
$S(\theta)$ computed. Then the p individual values of $\theta$ are changed a
'basic' step in a one at a time fashion and $S(\theta)$ evaluated each time.
This information is used to design a pattern indicative of the likely
direction for successful moves. A pattern move is now made. If
successful (that is, $S(\theta)$ is reduced), each of the p values of $\theta$ at the
new base point is changed a basic step to see if the pattern may be
improved. All steps indicative of an improvement are now added to
all the previous steps to form a new pattern and the pattern move employed
anew. The originators of the method (R. Hooke and T. A. Jeeves) note
that once a pattern becomes established it will often grow until the pattern
moves are as much as 100 times as large as the basic steps. When a
pattern move fails to reduce $S(\theta)$ the authors propose starting a
completely new pattern off the current best point.
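
A minimal sketch of a pattern search in the spirit of the description above is given below. It is a simplified illustration, not the Hooke and Jeeves routine itself: the function names, the step-halving rule and the quadratic test function are all assumptions introduced for the example.

```python
def explore(f, center, step):
    """Change each coordinate one at a time by a basic step, keeping improvements."""
    point = list(center)
    best = f(point)
    for j in range(len(point)):
        for delta in (step, -step):
            trial = list(point)
            trial[j] += delta
            value = f(trial)
            if value < best:
                point, best = trial, value
                break
    return point, best


def pattern_search(f, start, step=0.5, min_step=1e-6, max_moves=10000):
    """Alternate exploratory steps and pattern moves until the step is tiny."""
    base = list(start)
    f_base = f(base)
    moves = 0
    while step > min_step and moves < max_moves:
        point, value = explore(f, base, step)
        if value >= f_base:
            step /= 2.0          # no basic step helps: refine (assumed rule)
            continue
        while value < f_base and moves < max_moves:
            # Pattern move: repeat the combined successful displacement.
            pattern = [2.0 * p - b for p, b in zip(point, base)]
            base, f_base = point, value
            point, value = explore(f, pattern, step)
            moves += 1
    return base, f_base


def example_surface(theta):
    """Hypothetical S(theta) with its minimum at (3, -1)."""
    return (theta[0] - 3.0) ** 2 + 2.0 * (theta[1] + 1.0) ** 2


best_point, best_value = pattern_search(example_surface, [0.0, 0.0])
print(best_point, best_value)
```

Because each accepted pattern move is added to the previous displacement, the moves lengthen while they keep succeeding, which is the growth behaviour noted by the originators.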

Elimination  of  Linear  Parameters 

Often a model $\eta = f(\xi, \theta)$ contains parameters that may be defined
as "linear", that is, upon differentiating the function $f(\xi, \theta)$ with respect
to a "linear" parameter, all the parameters disappear in the derivative.
Consider, for example, the model

$\eta = \theta_1 + e^{\theta_2 \xi}$

and its associated sum of squares function

(22)  $S(\theta) = \sum_u (y_u - \theta_1 - e^{\theta_2 \xi_u})^2$

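The derivative matrix X consists of the elements

$x_{u1} = \partial \eta_u / \partial \theta_1 = 1$   and   $x_{u2} = \partial \eta_u / \partial \theta_2 = \xi_u e^{\theta_2 \xi_u}$ .

Clearly the elements $x_{u1}$ contain neither parameter
and hence $\theta_1$ is said to enter the model "linearly". The normal
equations associated with this model are

(23)  $n \theta_1 + \sum_u e^{\theta_2 \xi_u} = \sum_u y_u$

      $\theta_1 \sum_u \xi_u e^{\theta_2 \xi_u} + \sum_u \xi_u e^{2 \theta_2 \xi_u} = \sum_u y_u \xi_u e^{\theta_2 \xi_u}$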

The first of these equations may be solved for $\theta_1$ to give

(24)  $\theta_1 = \bar{y} - \frac{1}{n} \sum_u e^{\theta_2 \xi_u}$ .

This expression for $\theta_1$ may be used in several ways. For example,
we can substitute for $\theta_1$ in the second normal equation in Eq. (23)
and then solve for $\theta_2$ by trial and error. Or, since we now have an
expression in $\theta_2$ only, we might attempt to linearize this normal
equation using a Taylor Series about some guessed value $\theta_2^{(0)}$ and
determine, in a fashion analogous to the Gaussian Iterant, corrections
on the guessed values. Upon substituting $\theta_1$ in $S(\theta)$ we obtain

$S(\theta) = \sum_u \left[ y_u - \bar{y} - \left( e^{\theta_2 \xi_u} - \frac{1}{n} \sum_v e^{\theta_2 \xi_v} \right) \right]^2$

It is now easy to calculate $S(\theta)$ for various values of $\theta_2$ and to
determine the minimum $S(\hat{\theta})$ as illustrated in Figure 7.


FIGURE  7 

The  Non-Linear  Parameter 


Once $\hat{\theta}_2$, the estimate of $\theta_2$ giving the minimum $S(\theta)$, is obtained,
$\hat{\theta}_1$ can be determined using equation (24). In general, it is always
possible to solve for all the linear parameters in terms of the
non-linear parameters and thus reduce the search for the minimum of
$S(\theta)$ to one involving only the non-linear parameters (12).
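
The reduction to a one-parameter search can be sketched as follows. The data values are illustrative assumptions, not taken from the paper, and the simple grid over $\theta_2$ stands in for the graphical search of Figure 7.

```python
import math

# Illustrative data (assumed for this sketch only).
XI = [1.1, 0.7, 1.7]
Y = [12.0, 11.0, 36.0]


def theta1_given(theta2):
    """Equation (24): theta1 = ybar - (1/n) * sum of exp(theta2 * xi_u)."""
    n = len(Y)
    return sum(Y) / n - sum(math.exp(theta2 * x) for x in XI) / n


def reduced_sum_of_squares(theta2):
    """S(theta) after the linear parameter theta1 has been eliminated."""
    t1 = theta1_given(theta2)
    return sum((y - t1 - math.exp(theta2 * x)) ** 2 for x, y in zip(XI, Y))


# Evaluate S over a grid of the non-linear parameter only.
grid = [i / 1000.0 for i in range(0, 4001)]
theta2_hat = min(grid, key=reduced_sum_of_squares)
theta1_hat = theta1_given(theta2_hat)
print(theta1_hat, theta2_hat, reduced_sum_of_squares(theta2_hat))
```

Only one parameter is searched; the linear parameter is recovered afterwards from equation (24), so the cost of the grid grows with the number of non-linear parameters alone.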


Confidence Regions for $\theta$

The confidence region for $\theta$ can be determined (13), as in the case
of linear models, by first determining that value of $S(\theta)$ which
would just produce a critical value of $F_{p,\nu,\alpha}$. The problem then
becomes one of locating the contour for this critical value of $S(\theta)$.
This can be accomplished if $S(\theta)$ has been determined over a
reasonably fine lattice of points throughout the parameter space. However,
as mentioned earlier, the evaluation of $S(\theta)$ over a large lattice can
be quite expensive in computation time.

An approximate confidence region can be constructed by first
converting the non-linear model into an approximate linear model about
the least squares estimate $\hat{\theta}$. The variances and covariances of the
estimated parameters are then given, approximately, by $(X^T X)^{-1} \sigma^2$
where the derivative matrix is evaluated at the point $\hat{\theta}$. The
approximate confidence region for $\theta$ is then given by the quadratic form

(26)  $(\theta - \hat{\theta})^T X^T X (\theta - \hat{\theta}) \le p \, s^2 F_{p,\nu,\alpha}$

where $s^2$ is the estimate of $\sigma^2$ based on $\nu$ degrees of freedom.

Design for Non-Linear Models

The problems of estimating the parameters $\theta$ in a non-linear
model $y_u = f(\xi_u, \theta) + \epsilon_u$ have been briefly reviewed. We turn now to
the problem of choosing the settings of the variables $\xi$ so that our
estimates of the $\theta$ are, in some sense, best. One criterion for a
good design is to choose the levels of the $\xi$, that is, construct the
design matrix, so that $(X^T X)^{-1}$ is as small as possible. This directs
us then to choose $\xi$ so that the determinant $|X^T X|$ is as large as
possible. G. E. P. Box and H. L. Lucas [14] employed this criterion
for the construction of a non-linear design in an early paper by
considering the special case where n = p, that is, where the number of
runs equals the number of parameters. For this special case $|X^T X| = |X|^2$.
Thus the problem becomes one of determining the levels of $\xi$ so as
to maximize the determinant $|X|$.

For example, suppose the model is $\eta = \theta_1 e^{\theta_2 \xi}$. Then the
determinant of the matrix of derivatives X becomes

(26)  $|X| = \begin{vmatrix} x_{11} & x_{12} \\ x_{21} & x_{22} \end{vmatrix} = \begin{vmatrix} e^{\theta_2 \xi_1} & \theta_1 \xi_1 e^{\theta_2 \xi_1} \\ e^{\theta_2 \xi_2} & \theta_1 \xi_2 e^{\theta_2 \xi_2} \end{vmatrix}$

where $\xi_1$ and $\xi_2$ are the two settings of $\xi$ to be determined. Clearly
initial guesses of the parameters, $\theta_1^*$ and $\theta_2^*$, are necessary before
these levels of $\xi$ can be determined. Let $\xi_{min} \le \xi \le \xi_{max}$ be
the admissible range of $\xi$. Then if the model represents an exponential
decay ($\theta_2$ is negative) we find that $|X|$ is maximized when
$\xi_1 = \xi_{min}$ and $\xi_2 = \xi_{min} + 1/|\theta_2^*|$. Thus if $\eta_1$ is the response at
$\xi_{min}$, the initial response, the experimenter is instructed to take the
next observation when $\eta = e^{-1} \eta_1$ = 36.8% of $\eta_1$. If the model represents
exponential growth ($\theta_2$ is positive), then $|X|$ is maximized by setting
$\xi_2 = \xi_{max}$ and $\xi_1 = \xi_{max} - 1/\theta_2^*$. Thus we should take our first
observation when $\eta = e^{-1}$ = 36.8% of the response at $\xi_{max}$. Box and
Lucas discuss design problems associated with other simple non-linear
models. In another paper [15], Box and W. G. Hunter discuss the
general problem of experimental design for non-linear models with the
two objectives of i) establishing the form of the model and
ii) estimating the parameters in the model most precisely.

Of course, for n = p the values $\xi_1$ and $\xi_2$ that maximize $|X|$
could have been determined by trial and error using a fast computer
once $\theta_1^*$ and $\theta_2^*$ were given, by choosing a lattice of values $\xi_1$ and
$\xi_2$ and determining the contours of $|X|$ as illustrated in Figure 8.

FIGURE 8

Contours of $|X|$
[Figure: choose different values of $\xi_1$ and $\xi_2$; evaluate $|X|$;
determine contours of $|X|$; choose $\xi_1$ and $\xi_2$ for $|X|$ maximum]

This brute force method can easily be extended to more settings of $\xi$.
In fact, those doing such computations will find that the levels of $\xi$ will
usually merely replicate themselves for n > 2. Further, models with
p parameters will produce designs with n = p points. In all of this,
the initial guessed values $\theta^*$ must be available.
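
The lattice search for the two-run design can be sketched as below for the model $\eta = \theta_1 e^{\theta_2 \xi}$. The parameter guesses, the range 0 to 10 and the lattice spacing are assumptions chosen only to illustrate the rule $\xi_2 = \xi_{min} + 1/|\theta_2^*|$ for decay and $\xi_1 = \xi_{max} - 1/\theta_2^*$ for growth.

```python
import math


def det_x(xi1, xi2, theta1, theta2):
    """|X| for eta = theta1 * exp(theta2 * xi) with n = p = 2 runs."""
    row1 = (math.exp(theta2 * xi1), theta1 * xi1 * math.exp(theta2 * xi1))
    row2 = (math.exp(theta2 * xi2), theta1 * xi2 * math.exp(theta2 * xi2))
    return row1[0] * row2[1] - row1[1] * row2[0]


def best_design(theta1, theta2, xi_min, xi_max, steps=200):
    """Search a lattice of (xi1, xi2) with xi1 < xi2 for the largest |X|."""
    grid = [xi_min + (xi_max - xi_min) * i / steps for i in range(steps + 1)]
    best = max((abs(det_x(a, b, theta1, theta2)), a, b)
               for a in grid for b in grid if a < b)
    return best[1], best[2]


# Assumed guesses: theta1* = 1, theta2* = -0.5 (decay) and +0.5 (growth).
decay = best_design(1.0, -0.5, 0.0, 10.0)
growth = best_design(1.0, 0.5, 0.0, 10.0)
print(decay, growth)
```

For the decay case the search should return the lower limit and a second point 1/|θ2*| = 2 units above it; for the growth case it should return the upper limit and a point 2 units below it.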

In a second report [16] Box and W. G. Hunter discuss the problem
of sequential non-linear designs. Here we begin with n observations $y$,
the results of a model $y = f(\xi, \theta) + \epsilon$, and the n x k design matrix.
By the methods of non-linear estimation we can then determine the
least squares estimate $\hat{\theta}$ of the p parameters. Knowing $\hat{\theta}$ we may
compute the n predicted values $\hat{y}_u = f(\xi_u, \hat{\theta})$ and finally the n x 1
vector of residuals $R = y - \hat{y}$. We may also compute the n x p elements
of the derivative matrix $X_n$ evaluated for $\theta = \hat{\theta}$. Let $C_n = |X_n^T X_n|$ be
the determinant of the p x p matrix $X_n^T X_n$. We now require $\xi_{n+1}$, the
settings of the k controlled variables for experiment n+1. As earlier,
subject to the experimental constraints on the variables we wish to
maximize the determinant

(27)  $C_{n+1} = |X_{n+1}^T X_{n+1}|$ .

Now $X_{n+1}^T X_{n+1} = X_n^T X_n + x_{n+1}^T x_{n+1}$, where $x_{n+1}$ is the (1 x p) row vector

$(x_{n+1,1}, x_{n+1,2}, \ldots, x_{n+1,p})$

and where the jth element $x_{n+1,j}$ is the derivative of the function
$f(\xi, \theta)$ with respect to $\theta_j$ evaluated at $\hat{\theta}$, that is
$x_{n+1,j} = \left[ \partial f(\xi_{n+1}, \theta) / \partial \theta_j \right]_{\theta = \hat{\theta}}$.

To determine the settings $\xi_{n+1}$ to maximize $C_{n+1}$ we now choose a
lattice of points in the space of the k controlled variables $\xi$, and by
determining $C_{n+1}$ at each of these lattice points, locate that setting
$\xi_{n+1}$ which maximizes $C_{n+1}$. Since we already know $X_n^T X_n$ this
calculation is not quite as onerous as might at first seem.

The following example is from the Box and W. G. Hunter report.
The non-linear model under study is

(28)  $\eta = \dfrac{\theta_1 \theta_3 \xi_1}{1 + \theta_1 \xi_1 + \theta_2 \xi_2}$
The two controlled variables, $\xi_1$ and $\xi_2$, are constrained to lie in the
interval 0 to 3. An initial experimental design, consisting of a $2^2$
factorial, was first employed to obtain data to help estimate the three
parameters. The design levels and response were:

(29)
        $\xi_1$    $\xi_2$      y
         1     1    0.126
         2     1    0.219
         1     2    0.076
         2     2    0.126

To begin the non-linear estimation computation the initial guessed values
of the parameters were $\theta_1^{(0)} = 2.9$; $\theta_2^{(0)} = 12.2$ and $\theta_3^{(0)} = 0.69$.
The least squares estimates $\hat{\theta}$ were $\hat{\theta}_1 = 10.39$; $\hat{\theta}_2 = 48.83$ and $\hat{\theta}_3 = 0.74$.
These estimates of the parameters were then used to compute the
elements in the derivative matrix $X_n$.

To determine the location of the fifth experiment the determinant
$C_{n+1}$ was estimated for a grid of values of $\xi_1$ and $\xi_2$:

$C_{n+1} = \left| c_{ij} + x_{i5} x_{j5} \right|$ ,  i, j = 1, 2, 3,

where the $c_{ij}$ are the elements of $X_n^T X_n$ and $x_{j5}$ is the jth derivative
evaluated at the trial setting for the fifth experiment.

FIGURE 9

The maximum of $C_5$ occurs at $\xi_{15} = 0.1$ and $\xi_{25} = 0.0$. The next
experiment gave $y_5 = 0.186$ and the new estimates (using the $\hat{\theta}$ as the
initial guessed values in the iteration) were $\hat{\theta}_1 = 3.11$; $\hat{\theta}_2 = 15.19$ and
$\hat{\theta}_3 = 0.79$. We now begin anew. $C_6$ was maximum at $\xi_{16} = 3.0$ and
$\xi_{26} = 0$. The new observation was $y_6 = 0.606$ and the newest estimates
$\hat{\theta}_1 = 3.96$; $\hat{\theta}_2 = 15.32$ and $\hat{\theta}_3 = 0.66$. Box and Hunter proceeded until
n = 13. Of very considerable interest is the fact that the nine
experiments following the initial $2^2$ grouped themselves into three regions
in the space of $\xi_1$ and $\xi_2$. These regions, A, B, and C, are noted
in Figure 9. These three regions roughly define the "intrinsic" design
configuration for the model and proposed experimental region.
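
One step of the sequential procedure can be sketched as follows, assuming model (28) has the form $\eta = \theta_1 \theta_3 \xi_1 / (1 + \theta_1 \xi_1 + \theta_2 \xi_2)$ and using the rounded estimates quoted above. The function names and the lattice spacing of 0.1 are assumptions; this is not the program used by Box and Hunter.

```python
def model(xi1, xi2, theta):
    """Assumed form of model (28)."""
    t1, t2, t3 = theta
    return t1 * t3 * xi1 / (1.0 + t1 * xi1 + t2 * xi2)


def gradient(xi1, xi2, theta):
    """Derivatives of the model with respect to theta1, theta2, theta3."""
    t1, t2, t3 = theta
    d = 1.0 + t1 * xi1 + t2 * xi2
    return [t3 * xi1 * (1.0 + t2 * xi2) / d ** 2,
            -t1 * t3 * xi1 * xi2 / d ** 2,
            t1 * xi1 / d]


def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def cross_products(points, theta):
    """X'X for the runs already made."""
    rows = [gradient(a, b, theta) for a, b in points]
    return [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]


def next_point(points, theta, lo=0.0, hi=3.0, steps=30):
    """Lattice search for the setting that maximizes |X'X + x'x|."""
    base = cross_products(points, theta)
    candidates = []
    for i in range(steps + 1):
        for j in range(steps + 1):
            a = lo + (hi - lo) * i / steps
            b = lo + (hi - lo) * j / steps
            x = gradient(a, b, theta)
            c = det3([[base[r][s] + x[r] * x[s] for s in range(3)]
                      for r in range(3)])
            candidates.append((c, a, b))
    return max(candidates)


DESIGN = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (2.0, 2.0)]   # initial 2^2 factorial
THETA_HAT = (10.39, 48.83, 0.74)                            # rounded estimates
print(next_point(DESIGN, THETA_HAT))
```

Only the rank-one term `x[r] * x[s]` changes from one lattice point to the next, which is why the cross-product matrix of the earlier runs is computed once. Whether the sketch lands exactly on the reported setting depends on the rounding of the estimates and on the lattice chosen.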


The criterion, maximize $|X^T X|$, is certainly not the only one an
experimenter might propose. For example, an experimenter might
wish to minimize the trace of $(X^T X)^{-1}$, or propose values for various
elements in the $X^T X$ matrix. The problem now would be one of choosing
the settings of the $\xi$, for n fixed, to satisfy these constraints,
that is, given $X^T X$ can we determine $\xi$? Box and W. G. Hunter solve
this important problem for the special case of p = k+1 in their report.


Although the way forward to the construction of non-linear designs
has been indicated by the work of G. E. P. Box and W. G. Hunter, the
application of these methods has only begun. It is evident that designs
will have to be constructed for each model and experimenter, since
initial guessed values of the non-linear parameters are required. The
question of how sensitive a derived design is to fluctuations in the
initial guesses is largely unanswered, and many more questions could
be posed. One thing is certain: the arts of experimental design continue
to grow rapidly.
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ABSTRACT.  A  technique  is  discussed  for  framing  a  reliability  model 
in  terms  of  variables  data  rather  than  attribute  data.  A  particular  model 
is  developed  in  terms  of  a  Gamma  process;  it  is  believed  that  the  model 
may  prove  applicable  to  items  undergoing  long  term  storage,  especially 
where continuous observations are not feasible. Estimates of the
parameters of the model, along with a discussion of procedures for control and
verification, are included.
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Vol.  10,  No.  2,  January  1964. 


SYSTEMATIC METHODS FOR ANALYZING $2^n 3^m$ FACTORIAL
EXPERIMENTS*

Barry H. Margolin

ABSTRACT. Two systematic procedures to facilitate the analysis of
a complete $2^n 3^m$ factorial experiment are presented. The methods are
applicable when all the quantitative three-level factors are equally spaced
and when the contrasts involving qualitative three-level factors appear as
if the three-level factors were in fact quantitative and equally spaced.
Algorithm I systematizes the calculation of the factor effects for the
$2^n 3^m$ series of designs. Algorithm II yields the set of fitted values,
and hence the residuals, based on those factor effects which have been
judged to be non-negligible. The two algorithms have additional and
possibly more important uses in studying fractionated $2^n 3^m$ factorial
experiments. Algorithm I can be used to facilitate the writing down of
the cross-product matrix for a desired set of factor effects for a specified
set of treatment combinations. For the special case of the standard
fractionated $2^{n-p}$ series of designs the two algorithms can be used to
find the set of defining contrasts corresponding to a given set of
treatment combinations or to find the set of treatment combinations
corresponding to a given set of defining contrasts.

1. INTRODUCTION. In his oft-quoted bulletin in 1937 on the design
and analysis of factorial experiments Yates [7] presented two systematic
tabular algorithms for the $2^n$ series of factorial designs, i.e., designs
for studying n two-level factors. The algorithms presented were for the
calculation of the factor effects and the calculation of the fitted (predicted)
values based on those factor effects judged to be non-negligible. Davies
[4] extended the first procedure for calculating factor effects to the $3^m$
series of designs, i.e., designs for studying m three-level factors.
These methods have enabled the factorial experimenter who lacks a high
speed computer to save a considerable amount of time and effort in his
data analysis. Even where a computer has been available, it has usually
proven beneficial to program the algorithms as opposed to the standard
method of analysis. This paper presents two procedures for calculating
factor effects and fitted values for the $2^n 3^m$ series of complete factorial
designs. In addition, the algorithms have further applications to the study
of the fractionated $2^n 3^m$ series of designs.


*This work was begun while the author was a summer employee of the
United States Army Electronics Command, Fort Monmouth, during the
period 6/65 - 9/65.


2.  THE  MODEL.  Throughout  this  paper  we  will  be  dealing  with  a 
factorial  experiment  in  which  n  factors  are  studied  at  two  levels  each 
and  m  factors  are  studied  at  three  levels  each.  Unless  it  is  stated  to 
the  contrary  the  experiments  will  be  complete  factorials.  In  addition, 
the  effects  attributable  to  a  three-level  factor  and  its  interactions  will 
be  broken  into  the  usual  single  degree  of  freedom  components,  namely, 
a  linear  component,  a  quadratic  component,  and  interactions  involving 
these  components.  This  breakdown  of  an  effect  into  its  single  degree  of 
freedom  components  is  discussed  elsewhere  by  Davies  [4]  . 

Let us adopt the following notation: Designate the n two-level
factors by letters A, B, ... and the m three-level factors by letters
R, S, ... . The main effects of the two-level factors will be indicated
by the same capital letters used to indicate the factors. Thus, for
example, A will indicate either factor A or the main effect of factor
A. It will always be clear from the context of the discussion which
interpretation is desired. The two main effect components of a three-level
factor will be indicated by the capital letter indicating the factor plus a
subscript L or Q, depending upon whether we wish to denote the linear
or quadratic component, e.g., $R_L$ will denote the linear effect of factor
R. A single degree of freedom component of a multi-factor interaction
effect will be designated by a "word" consisting of the capital letters with
subscripts where necessary, corresponding to the factors interacting.
Thus, $ABR_L S_Q$ will denote the single degree of freedom effect
corresponding to the interaction

(A) X (B) X (linear R) X (quadratic S).


Finally, $\mu$ will designate the grand mean, i.e., the average of the
expected values of all treatment combinations in the full factorial.

In the model, the expected value of the response to the (i)th
treatment combination, say $E(y_i)$, $i = 1, 2, \ldots, 2^n 3^m$, is expressible as a
linear combination of all the main and multi-factor interaction effects
plus the grand mean. To illustrate the model for the $2^2 3$ design, let
A, B and R be the two two-level factors and the three-level factor
respectively. Then we assume:

$E(y_i) = \mu X_{\mu i} + (A) X_{Ai} + (B) X_{Bi} + (R_L) X_{R_L i} + (R_Q) X_{R_Q i} + (AB) X_{ABi}$
$\qquad + (AR_L) X_{AR_L i} + (AR_Q) X_{AR_Q i} + (BR_L) X_{BR_L i} + (BR_Q) X_{BR_Q i}$
$\qquad + (ABR_L) X_{ABR_L i} + (ABR_Q) X_{ABR_Q i}$ ,    $i = 1, 2, \ldots, 12$.

We also assume that the variance of each observation $y_i$ is constant,
say $\sigma^2$, and that the observations are independent.

The values of the coefficients $X_{\mu i}, X_{Ai}, \ldots, X_{ABR_Q i}$, $i = 1, 2, \ldots, 12$,
are determined by the settings of the factors A, B and R for the (i)th
treatment combination as follows:

1) $X_{\mu i} = 1$, $i = 1, 2, \ldots, 12$.

2) If factor A is at its low level, $X_{Ai} = -1$; otherwise, $X_{Ai} = 1$.

3) If factor B is at its low level, $X_{Bi} = -1$; otherwise, $X_{Bi} = 1$.

4) If factor R is at its low level, $X_{R_L i} = -1$ and $X_{R_Q i} = 1$.

5) If factor R is at its intermediate level, $X_{R_L i} = 0$ and $X_{R_Q i} = -2$.

6) If factor R is at its high level, $X_{R_L i} = 1$ and $X_{R_Q i} = 1$.

7) The coefficient corresponding to any interaction will have a
value equal to the product of the coefficients of those factor effect
components which are interacting, e.g., $X_{ABR_Q i} = X_{Ai} X_{Bi} X_{R_Q i}$.

If we let $E(Y) = (E(y_1), \ldots, E(y_{12}))$,

$X = (X_\mu, X_A, \ldots, X_{ABR_Q})$ (each $X_j$ being the 12 x 1 column
vector of the coefficients $X_{ji}$) and

$\beta = (\mu, A, B, \ldots, ABR_Q)$,

then the model can be reformulated as: $E(Y) = \beta X^T$, with independent
observations of common variance.

Algorithm I, presented in the next section, enables the calculation
of $\hat{\beta}$, the estimate of $\beta$, in just one tabular operation.
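
The coefficient rules 1) to 7) can be checked with a short sketch that builds the 12 x 12 matrix for the $2^2 3$ design. The function and variable names are illustrative assumptions; the columns are generated in the order mean, $R_L$, $R_Q$, B, ..., $ABR_Q$.

```python
from itertools import product


def coefficient_rows():
    """Rows of X for the 2^2 3^1 design, treatments ordered by (A, B, R)."""
    r_linear = {0: -1, 1: 0, 2: 1}       # rules 4) to 6), linear component
    r_quadratic = {0: 1, 1: -2, 2: 1}    # rules 4) to 6), quadratic component
    names = ["mean", "R_L", "R_Q", "B", "BR_L", "BR_Q",
             "A", "AR_L", "AR_Q", "AB", "ABR_L", "ABR_Q"]
    rows = []
    for a, b, r in product((0, 1), (0, 1), (0, 1, 2)):
        xa = 1 if a else -1              # rule 2)
        xb = 1 if b else -1              # rule 3)
        r_terms = (1, r_linear[r], r_quadratic[r])
        # Rule 7): interaction coefficients are products of the components.
        rows.append([ta * tb * tr
                     for ta in (1, xa) for tb in (1, xb) for tr in r_terms])
    return names, rows


names, rows = coefficient_rows()
columns = list(zip(*rows))
sums_of_squares = [sum(v * v for v in col) for col in columns]
print(dict(zip(names, sums_of_squares)))
```

The columns are mutually orthogonal, so each effect can be estimated from its own contrast, and the column sums of squares (12, 8 or 24) are the quantities by which the contrast sums are later divided.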

3. CALCULATION OF THE FACTOR EFFECTS. We revert to the
general case of a $2^n 3^m$ design. For the levels of the factors, we need
the following notation: Let 0 and 1 designate the low level and high
level respectively for each two-level factor. Let 0, 1 and 2 designate
the low, intermediate and high levels respectively for each three-level
factor.

Now every treatment combination can be identified with an (n+m)-place
integer, possibly beginning with zero. The integral value
corresponding to a treatment combination will have a 0 or 1 in the first
place, depending upon the level of the A factor; it will have a 0 or 1
in the second place, depending upon the level of the B factor, and so
on for the first n places corresponding to the n two-level factors. The
(n+1)st place will contain a 0, 1, or 2, depending upon the level of the
R factor, and so on for the m places corresponding to the m three-level
factors.

We now define a column of treatment combinations to be in standard
order if the corresponding column of (n+m)-place "integers" is in
ascending order of magnitude. The systematic method for the
calculation of the factor effects is a direct combination of the methods known
for the $2^n$ and $3^m$ series [7,4]. Write down in a column the treatment
combinations in standard order. In the adjacent column enter the observed
responses. Consider this column of observed responses, usually called
column zero, and each of the succeeding m-1 columns as consecutive sets
of three values. Then:

(i) For each set, form the sum of the three numbers $(y_1 + y_2 + y_3)$
and enter these values in order in the next column (column I).

(ii) Form the difference: the third element minus the first element
$(y_3 - y_1)$ for every set, and enter these values in order in column I under
the sums just calculated.

(iii) Form the sum of the first and third values minus twice the second
value in every set $(y_1 - 2y_2 + y_3)$ and enter these numbers in order in
column I, which is now completed.

(iv) Repeat the above three-step operation m-1 times, so that it has
been performed m times in all.

Now consider this last column arrived at after (iv) and the following
n-1 columns as consecutive sets of two elements.

(v) For each set form the sum of the two values $(x_1 + x_2)$ and enter
these values in order in the next column.

(vi) Then form the difference: the second number minus the first
$(x_2 - x_1)$ for each set, and enter these values in order under the sums
just calculated in (v).

(vii) Repeat this two-step operation n-1 times, so that it has been
performed n times in all.

The final column now contains the contrast sums (not effects) for
the factor effects in standard order. Standard order of the factor effects
for a $2^2 3^2$, for example, is: total, $S_L$, $S_Q$, $R_L$, $R_L S_L$, $R_L S_Q$, $R_Q$,
$R_Q S_L$, $R_Q S_Q$, B, $BS_L$, ..., $BR_Q S_Q$, A, $AS_L$, ..., $ABR_Q S_Q$.

To calculate the factor effects (not the standardized factor effects)
one must divide each contrast sum by its appropriate divisor. This
divisor is given by

Divisor = $2^{n+i} \, 3^{m-p}$

where i = number of three-level factors in the effect, e.g., for $ABR_L S_Q$,
i = 2; and where p = number of linear terms of three-level factors in the
effect, e.g., for $ABR_L S_Q$, p = 1.

To calculate the sum of squares for any effect, square the
corresponding contrast sum and divide by the above divisor, or square the
effect and multiply by the above divisor.


By way of clarification of the above exposition consider the following
tabular analysis of a contrived $2^2 3^1$:


Example I

A  B  R  Response     I    II    III  Divisor  Effect  Effect Name  Sum of Squares
0  0  0     28       99   234    360     12      30    Mean             10,800
0  0  1     27      135   126     64      8       8    R_L                 512
0  0  2     44       21    52    120     24       5    R_Q                 600
0  1  0     36      105    12    120     12      10    B                 1,200
0  1  1     27       16    72     56      8       7    BR_L                392
0  1  2     72       36    48     72     24       3    BR_Q                216
1  0  0     14      -12    36   -108     12      -9    A                   972
1  0  1      5       24    84    -40      8      -5    AR_L                200
1  0  2      2       18    20    -24     24      -1    AR_Q                 24
1  1  0     30       54    36     48     12       4    AB                  192
1  1  1     21        6    36     16      8       2    ABR_L                32
1  1  2     54       42    36      0     24       0    ABR_Q                 0

Total sum of squares = 15,140
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Two  final  comments  on  this  algorithm  are  in  order: 

(i) To calculate standardized effects (constant variance), to be
used, for example, in half-normal plotting [3], one must divide the
elements of the column of contrast sums by the square root of the
appropriate divisor presented previously.

(ii) If m = 0, this procedure reduces to the Yates method for
the $2^n$ series; if n = 0, this procedure reduces to the Davies technique
for the $3^m$ series [7,4].
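
Algorithm I can be sketched in Python as below. The function names are illustrative assumptions, and the sketch has been checked here only against the contrived $2^2 3^1$ responses of Example I.

```python
from itertools import product


def three_level_pass(col):
    """Steps (i) to (iii): sums, linear and quadratic contrasts of each triple."""
    sets = [col[k:k + 3] for k in range(0, len(col), 3)]
    sums = [y1 + y2 + y3 for y1, y2, y3 in sets]
    linear = [y3 - y1 for y1, y2, y3 in sets]
    quadratic = [y1 - 2 * y2 + y3 for y1, y2, y3 in sets]
    return sums + linear + quadratic


def two_level_pass(col):
    """Steps (v) and (vi): sums, then differences (second minus first)."""
    sets = [col[k:k + 2] for k in range(0, len(col), 2)]
    return [x1 + x2 for x1, x2 in sets] + [x2 - x1 for x1, x2 in sets]


def contrast_sums(responses, n, m):
    """Apply the three-level pass m times, then the two-level pass n times."""
    col = list(responses)
    for _ in range(m):
        col = three_level_pass(col)
    for _ in range(n):
        col = two_level_pass(col)
    return col


def divisors(n, m):
    """Divisor 2^(n+i) * 3^(m-p) for each effect in standard order."""
    out = []
    levels = [(0, 1)] * n + [(0, 1, 2)] * m   # 1 = linear, 2 = quadratic
    for digits in product(*levels):
        three = digits[n:]
        i = sum(1 for d in three if d != 0)   # three-level factors present
        p = sum(1 for d in three if d == 1)   # linear terms present
        out.append(2 ** (n + i) * 3 ** (m - p))
    return out


responses = [28, 27, 44, 36, 27, 72, 14, 5, 2, 30, 21, 54]
sums = contrast_sums(responses, n=2, m=1)
divs = divisors(n=2, m=1)
effects = [c / d for c, d in zip(sums, divs)]
squares = [c * c / d for c, d in zip(sums, divs)]
print(sums, effects, sum(squares))
```

Each pass is a single sweep over the column, so the whole analysis takes n + m sweeps rather than one contrast calculation per effect.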

4.  CALCULATION  OF  THE  FITTED  VALUES.  We  observed 
previously  that  the  result  of  the  first  algorithm  is  a  column  of  factor 
effects  in  standard  order.  One  can  then  judge  these  effects  as  to  their 
significance,  either  by  a  half-normal  plot  employing  the  standardized 
effects,  or  by  the  usual  analysis  of  variance  using  the  calculated  sums 
of  squares.  One  need  next  calculate  the  fitted  values  and  the  set  of 
residuals  (the  observed  response  minus  the  fitted  value).  This  enables 
one  to  check  in  detail  the  fit  of  the  equation  based  on  the  significant 
effects to the observed data. For this purpose we propose the
following tabular algorithm:

(i)  Write  down  the  column  of  effects  (contrast  sums  divided  by 
appropriate  divisor)  in  standard  order,  replacing  those  judged  to  be 
negligible  by  a  zero. 

(ii) As in the first algorithm, regard the numbers in this column and the succeeding m-1 columns as consecutive sets of three values. For each set, form the sum of the first and third elements minus the second element ($y_1 - y_2 + y_3$) and enter these values in order in the next column.

(iii) Next, form the difference: the first element minus twice the third element in each set ($y_1 - 2y_3$), and enter these numbers in order under the values calculated in the previous step.

(iv) Form the sum of the elements in each set ($y_1 + y_2 + y_3$) and enter these values in order in the remaining spaces in the next column.

(v) Repeat this three-step operation m-1 times, so that it has been performed m times in all.

(vi) Invert this last column, i.e., write it in reverse order.

(vii) Consider this new column and the succeeding n-1 columns as consecutive sets of two numbers. For each set, form the sum of the two values ($x_1 + x_2$) and enter these values in order in the next column.

(viii) Form the difference: the second number minus the first number in each set ($x_2 - x_1$), and enter these values in order under the sums calculated in (vii).

(ix) Repeat this two-step operation n-1 times, so that it has been performed n times in all.

(x) Invert this last column.

The  resulting  column  contains  the  fitted  values  in  standard  order. 
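
The ten steps can be sketched in a few lines of Python. This is an illustrative sketch, not part of the original paper: the function name `fitted_values` is hypothetical, "invert" is taken to mean reversing the column, and the effect values are those of the $2^2 3^1$ worked example (mean, then R_L, R_Q, B, ..., ABR_Q in standard order).

```python
def fitted_values(effects, n, m):
    """Sketch of algorithm II: effects in standard order -> fitted values."""
    col = list(effects)
    for _ in range(m):  # steps (ii)-(v): one pass per three-level factor
        sets = [col[i:i + 3] for i in range(0, len(col), 3)]
        col = ([y1 - y2 + y3 for y1, y2, y3 in sets]      # step (ii)
               + [y1 - 2 * y3 for y1, y2, y3 in sets]     # step (iii)
               + [y1 + y2 + y3 for y1, y2, y3 in sets])   # step (iv)
    col.reverse()  # step (vi)
    for _ in range(n):  # steps (vii)-(ix): one pass per two-level factor
        pairs = [col[i:i + 2] for i in range(0, len(col), 2)]
        col = ([x1 + x2 for x1, x2 in pairs]              # step (vii)
               + [x2 - x1 for x1, x2 in pairs])           # step (viii)
    col.reverse()  # step (x)
    return col


# Effects of the 2^2 x 3^1 example; negligible effects would be set to zero here.
effects = [30, 8, 5, 10, 7, 3, -9, -5, -1, 4, 2, 0]
fitted = fitted_values(effects, n=2, m=1)
print(fitted)
```

Because no effect has been zeroed in this call, the fitted values coincide with the original responses; replacing small effects by zero before the call would give the fitted values of the reduced model.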

If  our  procedure  is  valid,  applying  it  to  the  calculated  effects  of  the 
earlier  example  should  yield  the  initial  observations  or  responses  in 
their  standard  ordering.  This  is  presented  below: 


Example 2

Effect             I    I inverted     II    III    III inverted = Fitted Value

Mean       30     27         6         -9     54        28
R_L         8      6       -15         63     21        27
R_Q         5     -5        20         -3     30        44
B          10      2        43         24      2        36
BR_L        7     20         4         -3      5        27
BR_Q        3      4        -7         33     14        72
A          -9     -7         4        -21     72        14
AR_L       -5      4        20         23     27         5
AR_Q       -1     43         2        -11     36         2
AB          4     20        -5         16     44        30
ABR_L       2    -15         6         -7     27        21
ABR_Q       0      6        27         21     28        54
Thus, the original set of responses is recovered, and it is in standard
order. Hence, algorithms I and II operate in an inverse manner.

Observe that for the $2^n$ series, i.e., m = 0, this procedure reduces
to the method presented by Yates [7] for calculating fitted values. One
first inverts a column of factor effects in standard order, where zeros
have replaced the negligible effects. Then one performs the calculations
required in algorithm I for the $2^n$ series. Finally, another column
inversion is required. The end result is a column of fitted values based on
the significant effects and it appears in standard order.

Algorithms  I  and  II  have  been  presented  without  proof,  but  their 
validity  can  be  verified  by  a  rather  untidy  argument  using  matrix  theory, 
or  by  an  inductive  argument.  While  the  proofs  have  been  omitted,  one 
should  observe  that  the  relationship  between  algorithms  I  and  II  is  much 
more  direct  than  it  appears.  Consider  steps  (i)  -  (iii)  in  algorithm  I;  they 
can  be  summarized  in  matrix  notation  as: 

$$(y_1, y_2, y_3) \begin{pmatrix} 1 & -1 & 1 \\ 1 & 0 & -2 \\ 1 & 1 & 1 \end{pmatrix}$$

Next,  steps  (ii)-(iv)  in  algorithm  II  can  be  summarized  as: 

$$(y_1, y_2, y_3) \begin{pmatrix} 1 & 1 & 1 \\ -1 & 0 & 1 \\ 1 & -2 & 1 \end{pmatrix}$$

Observe then that the second 3 × 3 matrix is merely the transpose of the
first 3 × 3 matrix. In a similar fashion, steps (v) and (vi) in algorithm I
can be summarized as:

$$(x_1, x_2) \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}$$
Also, steps (vi) - (viii) and (x) in algorithm II can be summarized as:

$$(x_1, x_2) \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$$

The product of the three 2 × 2 matrices directly above is:

$$\begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}$$

This is the transpose of the first 2 × 2 matrix above. This matrix relationship is not accidental; it generalizes as follows: Let M denote the matrix of coefficients which operates on the right of the $1 \times 2^n 3^m$ data matrix in a complete factorial and produces the matrix of contrast sums. Then M' operating on the right of the $1 \times 2^n 3^m$ matrix of the grand mean and the set of effects (not standardized), where zeros have replaced the negligible effects, produces the matrix of fitted values.
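
The transpose relationship can be checked numerically. The following Python sketch is not from the original paper. It assumes that algorithm I forms, for each set of three, the sum, $y_3 - y_1$ and $y_1 - 2y_2 + y_3$ (the three columns of the first 3 × 3 matrix above), and for each pair the sum and $x_2 - x_1$. The names `contrast_sums`, `M` and `divisors` are hypothetical, and the divisors are taken to be the column sums of squares of M, which is an assumption consistent with the divisors tabulated for the first example.

```python
from fractions import Fraction


def contrast_sums(column, n, m):
    """Sketch of algorithm I: responses in standard order -> contrast sums."""
    col = list(column)
    for _ in range(m):  # three-level passes
        sets = [col[i:i + 3] for i in range(0, len(col), 3)]
        col = ([a + b + c for a, b, c in sets]
               + [c - a for a, b, c in sets]
               + [a - 2 * b + c for a, b, c in sets])
    for _ in range(n):  # two-level passes
        pairs = [col[i:i + 2] for i in range(0, len(col), 2)]
        col = [a + b for a, b in pairs] + [b - a for a, b in pairs]
    return col


n, m = 2, 1
size = 2 ** n * 3 ** m
# Row i of M is algorithm I applied to the i-th unit vector, so sums = y M.
M = [contrast_sums([int(i == j) for j in range(size)], n, m) for i in range(size)]

y = [28, 27, 44, 36, 27, 72, 14, 5, 2, 30, 21, 54]  # responses of the example
sums = contrast_sums(y, n, m)
divisors = [sum(M[i][a] ** 2 for i in range(size)) for a in range(size)]
effects = [Fraction(s, d) for s, d in zip(sums, divisors)]
# Fitted values as (effects) M': entry i is the sum over a of effects[a] * M[i][a].
fitted = [sum(effects[a] * M[i][a] for a in range(size)) for i in range(size)]
print(sums, divisors, fitted, sep="\n")
```

With no effects set to zero, the product of the effects with M' returns the responses themselves, which is the matrix form of the statement that algorithms I and II are inverse to one another.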

5. FRACTIONATED $2^n 3^m$ FACTORIAL EXPERIMENTS. Fractionating
the $2^n 3^m$ series of factorial designs has not proven to be an
easy proposition. Webb [8] has presented a fairly thorough review of
the work that has been done in this area; however, there appears to be
room for further exploration and study. No attempt will be made in this
paper to produce new fractions of the $2^n 3^m$ series. We present, rather,
a procedure based on algorithm I for writing down the cross-product
or normal matrix for any desired set of factor effect estimates broken
into the usual single degree of freedom components, given a specified
fractional set of treatment combinations. The method presented is far
superior to the tedious sums of squares and cross-products calculation
usually used to determine the elements of the cross-product matrix
each time an altered set of factor effects is to be considered. This will
speed the evaluation of new designs by criteria to be discussed later,
and will facilitate the calculation of the desired estimates and evaluation
of the proposed model.

We  retain  the  model  presented  for  the  full  factorial;  however,  in 
a  fractionated  experiment  we  are  restricted  to  obtaining  estimates  of 
only  a  subset  of  the  set  of  all  single  degree  of  freedom  effects  possible 
in  the  full  factorial.  Note  that  in  a  full  factorial  one  may  be  interested 
also  in  only  a  subset  of  the  set  of  effects  possible,  but  that  is  by  choice. 
Those effects which are of no interest or cannot be estimated are then
suppressed by assuming them to be zero in the model. In addition, in
a fractionated experiment we no longer have $2^n 3^m$ treatment combinations
to run, but a smaller number, say N. Hence, if we are interested
in the subset of effects, both main and interaction, designated by
$(\mu, \alpha, \beta, \ldots, \omega)$, the model is

$$E(y_i) = \mu X_{\mu i} + \alpha X_{\alpha i} + \beta X_{\beta i} + \cdots + \omega X_{\omega i}, \quad i = 1, \ldots, N,$$

where $\mu$ is the grand mean, and the observations are independent with
variance $\sigma^2$. The coefficients $X_{\mu i}, X_{\alpha i}, \ldots, X_{\omega i}$ are determined as
before by the settings of the factors for the (i)th treatment combination.

DEFINITION. $X_\alpha = (X_{\alpha 1}, \ldots, X_{\alpha N})'$ will be called the indicator
variable corresponding to the effect $\alpha$.


DEFINITION. Two indicator variables $X_\alpha$ and $X_\beta$ will be said to
be orthogonal for the fractional factorial if

$$\sum_{i=1}^{N} X_{\alpha i} X_{\beta i} = 0;$$

otherwise, they will be said to be entangled. (We have purposefully
avoided using the ambiguous term "confounding".) As a consequence of
our particular model, $X_\alpha$ and $X_\beta$ are orthogonal if and only if

$$\sum_{i=1}^{N} X_{\alpha\beta i} = 0,$$

since

$$\sum_{i=1}^{N} X_{\alpha\beta i} = \sum_{i=1}^{N} X_{\alpha i} X_{\beta i}.$$
To be able to handle the case where both $\alpha$ and $\beta$ have factor components
in common, e.g., $\alpha = AR_LS_L$ and $\beta = ABR_LS_Q$, we need to
extend the notion of an indicator variable to allow subscripts containing
such meaningless symbols as $R_L^2$, $S_LS_Q$, and $A^2$. This will be purely
for convenience so that, for example, we can write

$$X_{AR_LS_L,\,i}\, X_{ABR_LS_Q,\,i} = X_{A^2BR_L^2S_LS_Q,\,i}.$$

DEFINITION. Effects $\alpha$ and $\beta$ will be said to be entangled for the
fractional factorial if their corresponding indicator variables are
entangled.

Note that aliasing of effects $\alpha$ and $\beta$ is the special case of entangling
where either $X_\alpha = X_\beta$ or $X_\alpha = -X_\beta$.

DEFINITION. If $\sum_{i=1}^{N} X_{\alpha i} \neq 0$, then $X_\alpha$ will be said to be an
entangling contrast for the design.

It is clear that if $X_\alpha$ is an entangling contrast, then $X_\alpha$ is
entangled with $X_\mu$, and hence, $\alpha$ is entangled with the grand mean $\mu$.
It should also be clear that defining contrasts, as defined for the fractionated
$2^{n-p}$ series of designs in [2], are merely special cases of
entangling contrasts where either $X_{\alpha i} = 1$ for $i = 1, \ldots, N$, or
$X_{\alpha i} = -1$ for $i = 1, \ldots, N$, and hence

$$\sum_{i=1}^{2^{n-p}} X_{\alpha i} = \pm 2^{n-p}.$$

6. CORRELATION AND ORTHOGONALITY. The normal or cross-product
matrix for a fractional factorial, necessary for least squares
estimation, requires simply the sums of squares and cross-products of
the indicator variables corresponding to the desired subset of effect
estimates. The normal matrix is singular if and only if the set of
indicator variables involved is linearly dependent. In this case we say
that the set of effects is non-estimable. The only way to circumvent
this problem and achieve unique estimates is to suppress a sufficient
number of effects to destroy all linear dependencies.

Let us assume that the normal matrix is non-singular. Then one
is interested in the inverse of the normal matrix for purposes of estimation
and determining the correlation between estimates. The inverse of the
normal matrix is in fact, apart from the factor $\sigma^2$, the covariance matrix
between effect estimates.

It  is  well  known  (see  [4]  ,  for  example)  that  if  the  set  of  indicator  variables 
is  completely  orthogonal,  i.e.  ,  any  two  indicator  variables  corresponding 
to  different  effects  are  orthogonal,  then  the  normal  matrix  and  the 
covariance  matrix  are  both  diagonal.  Hence,  the  correlation  between 
any  two  estimates  of  factor  effects  is  zero.  It  is  less  well  known  and 
deserves  repeating  that  orthogonality  of  a  pair  of  indicator  variables  is 
neither  necessary  nor  sufficient  for  the  corresponding  pair  of  estimates 
to  have  zero  correlation.  The  following  two  small  examples  will  illustrate 
this: 


I.
           Design         Indicator Variables
Run       A  B  C       X_mu    X_A    X_B    X_C

 1        0  0  0         1     -1     -1     -1
 2        0  1  0         1     -1      1     -1
 3        1  0  0         1      1     -1     -1
 4        1  1  1         1      1      1      1

The normal matrix is:

$$\begin{pmatrix} 4 & 0 & 0 & -2 \\ 0 & 4 & 0 & 2 \\ 0 & 0 & 4 & 2 \\ -2 & 2 & 2 & 4 \end{pmatrix}$$

and its inverse, the covariance matrix (apart from the factor $\sigma^2$), is:

$$\begin{pmatrix} 1/2 & -1/4 & -1/4 & 1/2 \\ -1/4 & 1/2 & 1/4 & -1/2 \\ -1/4 & 1/4 & 1/2 & -1/2 \\ 1/2 & -1/2 & -1/2 & 1 \end{pmatrix}$$

Thus, even though $\sum_{i=1}^{4} X_{Ai} X_{Bi} = 0$, $\mathrm{cov}(\hat{A}, \hat{B}) = \tfrac{1}{4}\sigma^2$, where $\hat{A}$ and $\hat{B}$
denote the estimated effects.


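II.
           Design         Indicator Variables
Run       R  S  A       X_mu   X_RL   X_SL    X_A

 1        1  0  0         1      0     -1     -1
 2        0  1  0         1     -1      0     -1
 3        1  1  1         1      0      0      1
 4        2  2  1         1      1      1      1

The normal and covariance matrices (the latter apart from the factor $\sigma^2$) are respectively:

$$\begin{pmatrix} 4 & 0 & 0 & 0 \\ 0 & 2 & 1 & 2 \\ 0 & 1 & 2 & 2 \\ 0 & 2 & 2 & 4 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} 1/4 & 0 & 0 & 0 \\ 0 & 1 & 0 & -1/2 \\ 0 & 0 & 1 & -1/2 \\ 0 & -1/2 & -1/2 & 3/4 \end{pmatrix}$$

Thus, $\mathrm{cov}(\hat{R}_L, \hat{S}_L) = 0$, but

$$\sum_{i=1}^{4} X_{R_L i} X_{S_L i} = 1.$$

Both these small designs were intended solely for illustrative purposes, but either might
conceivably arise at an early stage of some experiment in which the
factors are introduced sequentially and the results become available
sequentially.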
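
Both examples can be reproduced with exact rational arithmetic. The Python sketch below is illustrative only; the helper names `normal_matrix` and `inverse` are hypothetical, the indicator columns are typed in from the two small designs (two-level factors coded -1, 1 and the linear component coded -1, 0, 1), and the inverse is computed by plain Gauss-Jordan elimination.

```python
from fractions import Fraction


def normal_matrix(columns):
    """Sums of squares and cross-products of the indicator variables."""
    return [[sum(a * b for a, b in zip(u, v)) for v in columns] for u in columns]


def inverse(matrix):
    """Gauss-Jordan inverse over the rationals (matrix assumed non-singular)."""
    n = len(matrix)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for c in range(n):
        p = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [x / pivot for x in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                factor = aug[r][c]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


# Example I: columns X_mu, X_A, X_B, X_C for runs 1..4.
n1 = normal_matrix([[1, 1, 1, 1], [-1, -1, 1, 1], [-1, 1, -1, 1], [-1, -1, -1, 1]])
# Example II: columns X_mu, X_RL, X_SL, X_A for runs 1..4.
n2 = normal_matrix([[1, 1, 1, 1], [0, -1, 0, 1], [-1, 0, 0, 1], [-1, -1, 1, 1]])
inv1, inv2 = inverse(n1), inverse(n2)

print("Example I : cross-product", n1[1][2], " covariance/sigma^2", inv1[1][2])
print("Example II: cross-product", n2[1][2], " covariance/sigma^2", inv2[1][2])
```

In the first design the A and B columns are orthogonal yet their estimates are correlated; in the second the R_L and S_L columns are entangled yet their estimates are uncorrelated.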


7. USES OF THE ENTANGLING CONTRASTS. One needs to be
able to calculate the normal matrix for a design for any conceivable set
of desired estimates for both estimation and evaluation of the design.
It is with respect to this task that the entangling contrasts prove useful.
Consider the set of entangling contrasts corresponding to the set of all
possible effects for the fractional factorial under consideration. Then
this set contains succinctly the information needed to write down the
normal matrix corresponding to any desired set of effect estimates.
For example, suppose that $\alpha$, $\beta$ and $\gamma$ are three of the single degree
of freedom effects we are interested in for a particular fractionated
$2^n 3^m$ experiment. Suppose further that the only entangling contrast
for the experiment, regardless of the set of desired effects, is $X_{\alpha\beta\gamma}$.
Thus,

$$\sum_{i=1}^{N} X_{\alpha\beta\gamma i} = c \neq 0, \quad\text{whence}\quad \sum_{i=1}^{N} X_{\alpha i} X_{\beta\gamma i} = c.$$

Since we are interested in $\alpha$, $\beta$ and $\gamma$, it then follows that $X_\alpha$ is
entangled with $X_{\beta\gamma}$, and that the cross-product of $X_\alpha$ and $X_{\beta\gamma}$ is
equal to c. We will denote the cross-product of $X_\alpha$ and $X_{\beta\gamma}$ by
$(X_\alpha, X_{\beta\gamma})$. Hence, $(X_\alpha, X_{\beta\gamma}) = c$. Similarly,
$(X_\beta, X_{\alpha\gamma}) = (X_\gamma, X_{\alpha\beta}) = (X_\mu, X_{\alpha\beta\gamma}) = c$.
We shall call c the value of the entangling contrast.

Finally, we know that since $X_{\alpha\beta\gamma}$ is the only entangling contrast,
no other non-zero cross-products are possible. Hence, we can write
down the complete normal matrix for any desired set of effects just
from the knowledge of the entire set of entangling contrasts. It turns
out that we don't even need the entire set of entangling contrasts. This
reduction can be accomplished by use of the following easily verified
identities:



i) $X_{A^2\alpha} = X_\alpha$, where A is a two-level factor;

ii) $X_{R_L^2\alpha} = (2/3)X_\alpha + (1/3)X_{R_Q\alpha}$, where R is a three-level factor;

iii) $X_{R_Q^2\alpha} = 2X_\alpha - X_{R_Q\alpha}$, where R is a three-level factor; and

iv) $X_{R_LR_Q\alpha} = X_{R_L\alpha}$, where R is a three-level factor.


Hence, we need not calculate directly any entangling contrast which has
squared components or both the linear and quadratic components of the
same factor as part of its subscript. The remaining subset of entangling
contrasts will be called the generating set of entangling contrasts. Thus,
once we have determined our desired effects, we can proceed to write
down the corresponding normal matrix from the generating set of entangling
contrasts.
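
The four identities hold level by level, so they can be checked directly on the coefficient codings. The sketch below is an illustration, not part of the paper; it assumes the codings used elsewhere in the text (two-level factors coded -1, 1; linear component -1, 0, 1; quadratic component 1, -2, 1), and the function name `identities_hold` is hypothetical.

```python
from fractions import Fraction

LINEAR = [-1, 0, 1]      # R_L coefficient at levels 0, 1, 2
QUADRATIC = [1, -2, 1]   # R_Q coefficient at levels 0, 1, 2


def identities_hold():
    """Check identities i)-iv) for every level of the factor concerned."""
    ok = all(a * a == 1 for a in (-1, 1))                            # i)
    for lin, quad in zip(LINEAR, QUADRATIC):
        ok = ok and lin * lin == Fraction(2, 3) + Fraction(1, 3) * quad  # ii)
        ok = ok and quad * quad == 2 - quad                          # iii)
        ok = ok and lin * quad == lin                                # iv)
    return ok


print(identities_hold())
```

Since each identity is an equality between per-run coefficients, multiplying both sides by the remaining coefficient $X_{\alpha i}$ and summing over the runs gives the identities in the form stated above.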


There is a second related use of the set of entangling contrasts for
any desired set of effects. Frequently the normal matrix can be
rearranged so that there are square submatrices (proper) of non-zero
elements down the main diagonal and zeros elsewhere. Webb [8] has
termed such designs clumpwise-orthogonal designs. Such a rearrangement,
if possible, makes it easier to evaluate the determinant of the
entire normal matrix as the product of the determinants of the submatrices.
Thus, if the normal matrix is singular, one can localize the linear
dependencies by determining which submatrices are singular. The
inversion of the normal matrix is also facilitated, for one need only
invert each of the smaller submatrices. Finally, the rearrangement
allows us to state immediately that if $X_\alpha$ and $X_\beta$ are indicator variables
whose sums of squares are found in different submatrices, then
$\mathrm{cov}(\hat{\alpha}, \hat{\beta}) = 0$ [4].


The breakdown of the normal matrix is accomplished as follows:

Define ~ to be a relation between indicator variables $X_\alpha$ and $X_\beta$
such that $X_\alpha \sim X_\beta$ if and only if $X_\alpha$ is entangled with $X_\beta$, or if
there is a finite chain of indicator variables in the desired model, say
$X_{\lambda_1}, \ldots, X_{\lambda_n}$, such that $X_\alpha$ is entangled with $X_{\lambda_1}$, $X_{\lambda_i}$ is entangled
with $X_{\lambda_{i+1}}$, $i = 1, \ldots, n-1$, and $X_{\lambda_n}$ is entangled with $X_\beta$.

It should be clear that this relation is an equivalence relation for
the desired set of indicator variables corresponding to the desired set of
effect estimates. Hence, it determines equivalence classes or disjoint
subsets of the set of desired indicator variables. The corresponding
rearrangement of the normal matrix by equivalence classes will accomplish
the desired clumpwise-orthogonalization of the normal matrix. In practice
this is an easy operation to perform.
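
The equivalence classes are the connected components of the graph whose edges join entangled indicator variables. A minimal Python sketch follows, using a small union-find structure; the function name `entanglement_classes`, the effect labels and the normal matrix in the usage example are all hypothetical and serve only to illustrate the grouping.

```python
def entanglement_classes(names, normal):
    """Group effects whose indicator variables are linked by a chain of
    non-zero cross-products in the normal matrix."""
    parent = list(range(len(names)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if normal[i][j] != 0:          # X_i is entangled with X_j
                parent[find(i)] = find(j)

    classes = {}
    for i, name in enumerate(names):
        classes.setdefault(find(i), []).append(name)
    return sorted(classes.values())


# Hypothetical normal matrix: a-b and b-c are entangled, d-e are entangled.
names = ["a", "b", "c", "d", "e"]
normal = [[4, 1, 0, 0, 0],
          [1, 4, 2, 0, 0],
          [0, 2, 4, 0, 0],
          [0, 0, 0, 4, -1],
          [0, 0, 0, -1, 4]]
classes = entanglement_classes(names, normal)
print(classes)
```

Here a and c are not directly entangled but fall in the same class through the chain a, b, c; ordering the rows and columns of the normal matrix class by class yields the block-diagonal arrangement.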


8. DETERMINING THE GENERATING SET OF ENTANGLING
CONTRASTS. We intend to make use of algorithm I for calculating the
generating set of entangling contrasts. For any effect $\alpha$, algorithm I
forms the contrast sum

$$\sum_{i=1}^{2^n 3^m} y_i X_{\alpha i},$$

where $y_i$ is the response entry in the (i)th position in column zero. The
contrast sum appears in the final column in the position designated
for $\alpha$ in the standard ordering of all possible effects in the full factorial.
Let us consider what would happen if, instead of using $(y_1, \ldots, y_{2^n 3^m})'$
as column zero, we choose to have $(z_1, \ldots, z_{2^n 3^m})'$ as column zero,
where

$z_i$ = 1 if the (i)th treatment combination in the standard
order was run as part of the fractional factorial,
$i = 1, \ldots, 2^n 3^m$,
    = 0 otherwise.

Then one would find $\sum_{i=1}^{2^n 3^m} z_i X_{\alpha i}$ appearing in the position for $\alpha$ in the
final column. However,

$$\sum_{i=1}^{2^n 3^m} z_i X_{\alpha i} = \sum_{j \in S} X_{\alpha j},$$

where S is the set of those treatment combinations forming the given
fractional factorial. Thus, $\sum_{j \in S} X_{\alpha j}$ is simply the calculation we need
to determine whether or not $X_\alpha$ is an entangling contrast for the fractional
factorial. Thus, algorithm I can be employed to find the generating
set of entangling contrasts in any fraction of a $2^n 3^m$ design, since one
sweep of algorithm I performs the calculation of $\sum_{j \in S} X_{\alpha j}$ for all possible
effects $\alpha$ which are meaningful. Practically speaking, if the number
$2^n 3^m$ is relatively small, say 81 or less, this procedure works well,
and the bookkeeping does not become unreasonable even when calculating
by hand.

To summarize the procedure in this case, one writes down all
the treatment combinations in the full factorial in standard order. One
places a one in the response column next to each treatment combination
which was run in the fractional factorial and a zero in each of the
remaining positions in column zero. One then proceeds with algorithm
I as if this dummy response column were a true response column for a
full $2^n 3^m$ factorial. As in a full $2^n 3^m$ factorial, one identifies the
final column with a column of effects in standard order. Now, however,
the interpretation of the final column will differ from that of a column
of calculated contrasts. If there is a non-zero element in the final
column next to any effect, then the corresponding indicator variable is
an entangling contrast in the generating set with the value of the non-zero
element. For example, consider the following fraction of a $2^2 3^2$
consisting of 12 runs:

Run number    1   2   3   4   5   6   7   8   9  10  11  12

A             0   0   0   0   0   0   1   1   1   1   1   1
B             0   0   0   1   1   1   0   0   0   1   1   1
R             0   1   2   0   1   2   0   1   2   0   1   2
S             0   1   2   1   2   0   1   2   0   2   0   1

Here  A  and  B  designate  as  usual  the  two-level  factors  and  R  and  S 
designate  the  three-level  factors. 


This fraction was formed by setting $A + B + R \equiv S \pmod 3$.


Then  the  procedure  to  find  the  generating  set  of  entangling  contrasts 
is  demonstrated  below: 


A  B  R  S      0     I    II   III    IV    Contrast name

0  0  0  0      1     1     3     6    12    Total
0  0  0  1      0     1     3     6     0    S_L
0  0  0  2      0     1     3     0     0    S_Q
0  0  1  0      0     1     3     0     0    R_L
0  0  1  1      1     1     0     0    -1    R_LS_L
0  0  1  2      0     1     0     0     3    R_LS_Q
0  0  2  0      0     1     0     0     0    R_Q
0  0  2  1      0     1     0     0    -3    R_QS_L
0  0  2  2      1     1     0     1    -3    R_QS_Q
0  1  0  0      0     1     0    -2     0    B
0  1  0  1      1     1     0     3     0    BS_L
0  1  0  2      0     1     0     0     0    BS_Q
0  1  1  0      0    -1     0     0     0    BR_L
0  1  1  1      0     0     0     0    -3    BR_LS_L
0  1  1  2      1     1     0    -3    -3    BR_LS_Q
0  1  2  0      1     0     0     0     0    BR_Q
0  1  2  1      0     1     2     3     3    BR_QS_L
0  1  2  2      0    -1    -1    -6    -9    BR_QS_Q
1  0  0  0      0     0    -1     0     0    A
1  0  0  1      1     1    -1     0     0    AS_L
1  0  0  2      0    -1     0     0     0    AS_Q
1  0  1  0      0     1     3     0     0    AR_L
1  0  1  1      0    -1     3     0    -3    AR_LS_L
1  0  1  2      1     0    -3     0    -3    AR_LS_Q
1  0  2  0      1     1     0     0     0    AR_Q
1  0  2  1      0    -2     0     0     3    AR_QS_L
1  0  2  2      0     1     0    -3    -9    AR_QS_Q
1  1  0  0      0    -2     0     0     0    AB
1  1  0  1      0     1     0     3     0    ABS_L
1  1  0  2      1     1    -3    -6     0    ABS_Q
1  1  1  0      1    -2    -3     0     0    ABR_L
1  1  1  1      0     1     3     0     3    ABR_LS_L
1  1  1  2      0     1     6    -3    -9    ABR_LS_Q
1  1  2  0      0     1    -3     6     0    ABR_Q
1  1  2  1      1     1    -3    -9     9    ABR_QS_L
1  1  2  2      0    -2    -3     0     9    ABR_QS_Q
Hence, the generating set of entangling contrasts is as follows:

$R_LS_L = -1$, $R_LS_Q = 3$, $R_QS_L = -3$, $R_QS_Q = -3$, $BR_LS_L = -3$,
$BR_LS_Q = -3$, $BR_QS_L = 3$, $BR_QS_Q = -9$, $AR_LS_L = -3$, $AR_LS_Q = -3$,
$AR_QS_L = 3$, $AR_QS_Q = -9$, $ABR_LS_L = 3$, $ABR_LS_Q = -9$,
$ABR_QS_L = 9$, $ABR_QS_Q = 9$.

Thus, we find that there are even two-letter entangling contrasts, such
as $R_LS_L$, in this design. One could now proceed to write down the
normal matrix for any desired set of effect estimates based on these
twelve runs.
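
The dummy-response calculation for this twelve-run fraction can be reproduced as follows. The Python sketch is illustrative and not part of the paper: `contrast_sums` is a hypothetical name for algorithm I, assumed here to form the sum, $y_3 - y_1$ and $y_1 - 2y_2 + y_3$ on each set of three and the sum and $x_2 - x_1$ on each pair, and the effect labels are written without subscripts (RL for $R_L$, and so on).

```python
from itertools import product


def contrast_sums(column, n, m):
    """Sketch of algorithm I for a 2^n 3^m layout in standard order."""
    col = list(column)
    for _ in range(m):  # three-level passes
        sets = [col[i:i + 3] for i in range(0, len(col), 3)]
        col = ([a + b + c for a, b, c in sets]
               + [c - a for a, b, c in sets]
               + [a - 2 * b + c for a, b, c in sets])
    for _ in range(n):  # two-level passes
        pairs = [col[i:i + 2] for i in range(0, len(col), 2)]
        col = [a + b for a, b in pairs] + [b - a for a, b in pairs]
    return col


# Standard order: A slowest, then B, R, and S fastest.
levels = list(product(range(2), range(2), range(3), range(3)))
# Column zero: 1 if the treatment combination belongs to the fraction.
column_zero = [1 if (a + b + r) % 3 == s else 0 for a, b, r, s in levels]
final = contrast_sums(column_zero, n=2, m=2)

names = ["".join(parts) or "Total"
         for parts in product(["", "A"], ["", "B"], ["", "RL", "RQ"], ["", "SL", "SQ"])]
generating_set = {name: value for name, value in zip(names, final)
                  if value != 0 and name != "Total"}
for name, value in generating_set.items():
    print(name, value)
```

The first entry of the final column is simply the number of runs, so it is excluded from the generating set.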


If $2^n 3^m$ is a relatively large number so as to make the foregoing
procedure unwieldy, a variation of the above may be more suitable,
provided one can identify a set of "live" factors in the fractional
factorial, i.e., factors which, when the remaining factors are suppressed
in the design, form a full factorial. Thus, in the fractional $2^2 3^2$ in
twelve runs presented above, factors A, B and R may be considered
"live". The run numbers are already in standard order for the full
factorial on A, B and R as they are presented. Consider then that we
are dealing with a full $2^2 3^1$ design. Now, instead of a column of ones
and zeros, enter in column zero next to each treatment combination the
$X_{S_L i}$ corresponding to the run. Proceed with algorithm I for this
dummy response column for a $2^2 3^1$ design. Then identify the last column
with the effects in the $2^2 3^1$ on A, B and R. Note that we are actually
calculating

$$\sum_{i=1}^{12} X_{S_L i} X_{\alpha i}$$

for all effects $\alpha$ involving A, B and R as components. Then non-zero
elements in the last column of the algorithm will indicate which generating
entangling contrasts in this design involve $S_L$ as a component of
the subscript. Similarly, by taking as the zero column $(X_{S_Q 1}, \ldots, X_{S_Q 12})'$
one can find those entangling contrasts in the generating set which involve
$S_Q$ as a component of the subscript. Clearly, every entangling contrast
will involve either $S_L$ or $S_Q$ or both as a component of the subscript,
since A, B and R form a "live" full factorial and hence, no entangling
contrast can exist solely involving A, B and R. We have then found the
entire generating set of entangling contrasts in this plan by two applications
of algorithm I, each individually smaller in size than the application
of algorithm I presented earlier. The above variation is demonstrated
below:

"Live" Factors   Suppressed Factor                          Contrasts involving
A  B  R                 S            S_L     I    II   III    A, B and R

0  0  0                 0            -1      0     0     0    Total
0  0  1                 1             0      0     0    -1    R_L
0  0  2                 2             1      0     1    -3    R_Q
0  1  0                 1             0      0    -2     0    B
0  1  1                 2             1      2    -3    -3    BR_L
0  1  2                 0            -1     -1     0     3    BR_Q
1  0  0                 1             0     -1     0     0    A
1  0  1                 2             1     -1     0    -3    AR_L
1  0  2                 0            -1      0    -3     3    AR_Q
1  1  0                 2             1     -3     0     0    AB
1  1  1                 0            -1     -3    -3     3    ABR_L
1  1  2                 1             0      3     6     9    ABR_Q

Thus, we find the subset of the generating set of entangling contrasts
involving $S_L$ to be:

$R_LS_L = -1$, $R_QS_L = -3$, $BR_LS_L = -3$, $BR_QS_L = 3$, $AR_LS_L = -3$,
$AR_QS_L = 3$, $ABR_LS_L = 3$, $ABR_QS_L = 9$.

This checks with our previous calculation of the generating set of entangling
contrasts. A similar computation for $S_Q$ would complete the generating set
of entangling contrasts for this design.


9. YATES' $3^{3-1}$ DESIGNS. One specific investigation of the
entangling of single degree of freedom effects in a fractional $2^n 3^m$
deserves mention. In [7], Yates presented twelve distinct $3^{3-1}$ designs,
illustrated below with R, S and T representing the three factors and W,
X, Y and Z being Yates' own notation:


The following generating sets of entangling contrasts have been found
for the twelve different $3^{3-1}$ designs:


*1  '•  RLSLTI.  *  -3  '  RlVl  *  -3  '  Wl  -  3  ■  WL  ■  -9  ’ 

rlsltq  ■  3  •  rlsqtq  =  •’ •  rqsltq  =  9  '  rqsqtq  =  9  ■ 


a  c  i  i 

r\f  F  vna»<nSo«to 
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W2: 

Vltl  3  3  ’ 

rlsqtl  3  -3  * 

rqsltl  3  3  * 

Wl  *  9  ■ 

rlsltq *  3 ■ 

rlsqtq  ■  9  • 

rqsltq  3  -9  • 

Wa  =  9  • 

W3: 

rlsqtl  =  6  * 

rqsltl  =  '6  ' 

rlsltq ■ -6  ■ 

RqSqTq  *  -18  • 

Xl: 

rlsltl  =  -3  ‘ 

rlsqtl  3  3  • 

Vltl=-3  * 

Wl  3  -9  - 

rlsltq  3  3  ' 

rlsqtq  =  9  ’ 

rqsltq  =  -9  • 

rqsqtq  3  9  • 

V 

rlsqtl  =  -6  ’ 

VltL  3  6  • 

rlsltq  3  -6  - 

Vqtq  3  -18  • 

X3  = 

rlsltl  =  3  * 

rlsqtl  3  3 • 

rqsltl  ■  ‘3  - 

rqsqtl  3  9  * 

rlsltq  *  3 ■ 

rlsqtq  *  •’  ■ 

rqsltq  3  9  - 

Wo  3  9  • 

V 

R.  ST  Tt  =  -3  , 

L  L  L 

rlsqtl  =  3  * 

rqsltl  3  3  * 

Wl  ■  9  ■ 

Vltq  3  -3  ■ 

Vqto  ■  -9  • 

Vltq ■  •»  • 

Wa  ■  9  • 

V 

rlsqtl  ■  -6  • 

Vltl  ■  -6  • 

rlsltq ■  6 ■ 

Wa  ■  -18  • 

V 

rlsltl  3  3 ' 

rlsqtl  3  3  - 

rqsltl  3  3  • 

Wl  ■  •’  • 

rlsltq  -  -3  • 

rLsqtq  3  9  • 

Vi/q  =  9  * 

Wa  ’  9  • 

V 

rlsqtl  1  6  ■ 

Vltl  -  6  • 

rlsltq  ■  6  • 

rqsqtq  3  -18  • 

V 

rlsltl  3  -3  • 

RLSQTI,  =  -3  ’ 

Vltlc-3' 

rqsqtl  3  9  ’ 

rlsltq  =  -3  * 

rlsqtq  3  9  ' 

rqsltq  3  9  ■ 

rqsqtq  =  9  • 

V 

rlsltl‘  3’ 

rlsqtl  3  -3  ’ 

Vltl=-3’ 

Wl  ■  • 

rlsltq  ’3  1 

rlsqtq  3  -9  • 

rosltq  ■  -9 • 

Wo  =  9  • 
The main thing to observe is that there are four designs which have
only four entangling contrasts in the generating set and that there are
eight designs containing eight entangling contrasts in the generating set.
Thus, the twelve designs are by no means equal in their degrees or
patterns of entanglement for the particular model we are assuming. Note,
however, that all entangling contrasts involve three-letter words.


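For example, suppose we are interested in estimating $R_L$, $R_Q$,
$S_L$, $S_Q$, $T_L$, $T_Q$, $R_LS_L$ and $R_LT_L$, and hypothesize that $\mu = 0$.
Suppose further that we have some prior estimate of $\sigma^2$ and that we are
interested in considering designs $W_1$, $W_2$ and $W_3$ as possible experimental
designs. Then the normal matrices for $W_1$, $W_2$ and $W_3$ are
respectively:

$$\begin{pmatrix} 6&0&0&0&0&0&0&0\\ 0&18&0&0&0&0&0&0\\ 0&0&6&0&0&0&0&-3\\ 0&0&0&18&0&0&0&-3\\ 0&0&0&0&6&0&-3&0\\ 0&0&0&0&0&18&3&0\\ 0&0&0&0&-3&3&4&1\\ 0&0&-3&-3&0&0&1&4 \end{pmatrix}, \quad \begin{pmatrix} 6&0&0&0&0&0&0&0\\ 0&18&0&0&0&0&0&0\\ 0&0&6&0&0&0&0&3\\ 0&0&0&18&0&0&0&-3\\ 0&0&0&0&6&0&3&0\\ 0&0&0&0&0&18&3&0\\ 0&0&0&0&3&3&4&1\\ 0&0&3&-3&0&0&1&4 \end{pmatrix}$$

and

$$\begin{pmatrix} 6&0&0&0&0&0&0&0\\ 0&18&0&0&0&0&0&0\\ 0&0&6&0&0&0&0&0\\ 0&0&0&18&0&0&0&6\\ 0&0&0&0&6&0&0&0\\ 0&0&0&0&0&18&-6&0\\ 0&0&0&0&0&-6&4&-2\\ 0&0&0&6&0&0&-2&4 \end{pmatrix}$$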
where the ordering of the cross-product terms corresponds to the order
of the listing of the desired effects above. Two examples of the calculations
required for the normal matrix are:

i) $(R_LS_L, R_LS_L) = X_{R_L^2S_L^2} = (2/3)X_{R_L^2} + (1/3)X_{R_L^2S_Q} = (4/9)X_\mu + (2/9)X_{R_Q} + (2/9)X_{S_Q} + (1/9)X_{R_QS_Q}$

ii) $(R_LS_L, R_LT_L) = X_{R_L^2S_LT_L} = (2/3)X_{S_LT_L} + (1/3)X_{R_QS_LT_L}$

Hence, for all three designs we find, since $X_\mu = 9$, that

$(R_LS_L, R_LS_L) = (4/9) \cdot 9 + 0 + 0 + 0 = 4.$

Moreover, for $W_1$ and $W_2$,

$(R_LS_L, R_LT_L) = 0 + (1/3) \cdot 3 = 1,$

whereas for $W_3$,

$(R_LS_L, R_LT_L) = 0 + (1/3) \cdot (-6) = -2.$


A  criterion  for  differentiating  among  a  group  of  designs  utilizing  a 
given  number  of  treatment  combinations,  none  of  which  is  completely 
orthogonal  with  respect  to  a  desired  set  of  effect  estimates  has  been 
discussed  by  Webb  [8]  .  He  proposes  that  the  design  which  has  the 
smallest  determinant  of  the  covariance  matrix  might  be  optimal.  This 
is  equivalent  to  choosing  the  design  which  maximizes  the  determinant  of 
the normal matrix, and minimizes the volume of the confidence ellipsoid
on the parameters estimated [6].

The values of the determinants of the normal matrix for $W_1$, $W_2$
and $W_3$ are $3^{10} 2^6$, $3^{10} 2^6$ and 0. By methods discussed earlier, one
can easily localize the linear dependency in design $W_3$ to the subset
$(X_{S_Q}, X_{R_LT_L}, X_{R_LS_L}, X_{T_Q})$. In fact it is easily verified that for $W_3$,

$$X_{S_Q} = 3X_{R_LT_L} + 3X_{R_LS_L} + X_{T_Q}.$$

We might define a measure of the relative efficiency in general of a
design $D_1$ to a design $D_2$ with respect to a particular desired set of
effects to be

$$\frac{\det(\text{normal matrix for } D_1)}{\det(\text{normal matrix for } D_2)} \times 100\%.$$

In our consideration of $W_1$, $W_2$ and $W_3$ for the particular desired set
of effects, we would eliminate $W_3$ because of the linear dependency.
Then, the efficiency of $W_1$ relative to $W_2$ is 100%, so that they are
equally desirable according to our criterion.
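
The determinant criterion can be illustrated with exact arithmetic. The Python sketch below is not from the paper, and it rests on a labeled assumption: the three designs are generated as $R + 2S + T \equiv c \pmod 3$ with c = 0, 2, 1 for $W_1$, $W_2$, $W_3$, a choice made only because it reproduces the generating sets quoted for these designs; Yates' own definitions are not given in this section. The names `indicator`, `design`, `normal_matrix` and `determinant` are hypothetical.

```python
from fractions import Fraction

LINEAR = {0: -1, 1: 0, 2: 1}
QUADRATIC = {0: 1, 1: -2, 2: 1}
EFFECTS = ["RL", "RQ", "SL", "SQ", "TL", "TQ", "RLSL", "RLTL"]


def indicator(effect, run):
    """Coefficient of an effect such as 'RLSL' for one run (r, s, t)."""
    level = dict(zip("RST", run))
    value = 1
    for k in range(0, len(effect), 2):
        factor, component = effect[k], effect[k + 1]
        value *= (LINEAR if component == "L" else QUADRATIC)[level[factor]]
    return value


def design(c):
    """Assumed generator: the nine runs with R + 2S + T = c (mod 3)."""
    return [(r, s, t) for r in range(3) for s in range(3) for t in range(3)
            if (r + 2 * s + t) % 3 == c]


def normal_matrix(runs):
    cols = [[indicator(e, run) for run in runs] for e in EFFECTS]
    return [[sum(x * y for x, y in zip(u, v)) for v in cols] for u in cols]


def determinant(matrix):
    """Determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    det = Fraction(1)
    for c in range(len(a)):
        p = next((r for r in range(c, len(a)) if a[r][c] != 0), None)
        if p is None:
            return Fraction(0)
        if p != c:
            a[c], a[p] = a[p], a[c]
            det = -det
        det *= a[c][c]
        for r in range(c + 1, len(a)):
            factor = a[r][c] / a[c][c]
            a[r] = [x - factor * y for x, y in zip(a[r], a[c])]
    return det


designs = {"W1": design(0), "W2": design(2), "W3": design(1)}
dets = {name: determinant(normal_matrix(runs)) for name, runs in designs.items()}
efficiency = 100 * dets["W1"] / dets["W2"]  # relative efficiency in percent
print(dets, efficiency)
```

A zero determinant flags a design in which the desired set of effects is non-estimable, so such a design is dropped before efficiencies are compared.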

10. DETERMINING DEFINING CONTRASTS IN A $2^{n-p}$ DESIGN.
The $2^{n-p}$ series of fractionated equal frequency designs (as opposed,
for example, to designs of proportional frequency presented by
Addelman [1]) deserves special consideration. In this case, as we have
already pointed out, the concept of an entangling contrast reduces to
that of a defining contrast. Thus, the procedure presented for finding
the set of entangling contrasts will yield the set of defining contrasts
in a standard $2^{n-p}$ design. Gorman [5] observed this fact previously
and independently of this work. Solely for purposes of illustration,
we consider the following $2^{3-2}$ design:
Factor:   A  B  C

Run 1     0  0  1
Run 2     1  1  0

The procedure presented then is:

 A  B  C  |   0    I   II  III  |  Defining contrast
----------+---------------------+-------------------
 0  0  0  |   0    1    1    2  |  Total
 0  0  1  |   1    0    1    0  |  C
 0  1  0  |   0    0    1    0  |  B
 0  1  1  |   0    1   -1   -2  |  BC
 1  0  0  |   0    1   -1    0  |  A
 1  0  1  |   0    0    1   -2  |  AC
 1  1  0  |   1    0   -1    2  |  AB
 1  1  1  |   0   -1   -1    0  |  ABC

Thus,  the  set  of  defining  contrasts  is; 

I  =  -BC  =  -AC  =  AB  , 

where  the  sign  of  the  defining  contrast  is  also  determined  by  the  last 
column  of  the  algorithm. 
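
The tabulated columns I, II and III are reproduced by the usual sum-and-difference (Yates-type) pass over an indicator column, so the procedure can be sketched as below. This is a minimal sketch, assuming that the algorithm of the earlier sections coincides with that pass for two-level factors; the function names are introduced here for illustration only.

```python
from itertools import product

def yates(column):
    """Repeated pass of pairwise sums followed by pairwise differences."""
    values = list(column)
    rounds = len(values).bit_length() - 1
    for _ in range(rounds):
        pairs = list(zip(values[0::2], values[1::2]))
        values = [a + b for a, b in pairs] + [b - a for a, b in pairs]
    return values

def effect_labels(factors):
    """Labels in the order Total, C, B, BC, A, ... (last factor varies fastest)."""
    labels = []
    for levels in product((0, 1), repeat=len(factors)):
        name = "".join(f for f, level in zip(factors, levels) if level)
        labels.append(name or "Total")
    return labels

def defining_contrasts(runs, factors="ABC"):
    """Return {effect: sign} for effects whose final entry equals +/- (number of runs)."""
    order = list(product((0, 1), repeat=len(factors)))
    indicator = [1 if t in runs else 0 for t in order]   # column 0 of the table
    final = yates(indicator)                             # last column of the table
    return {label: (1 if value > 0 else -1)
            for label, value in zip(effect_labels(factors), final)
            if abs(value) == len(runs)}

contrasts = defining_contrasts({(0, 0, 1), (1, 1, 0)})
print(contrasts)
```

The entry for Total is always +1 and corresponds to the identity I in the defining relation.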

11. DETERMINING THE SET OF TREATMENT COMBINATIONS IN
A 2^{n-p} DESIGN. Frequently one knows the set of defining contrasts for
a chosen 2^{n-p} factorial design and desires to know which treatment
combinations form the desired fractional factorial. Begin with the column of
effects for a full 2^n design, where the dummy effect column contains a
plus or minus one next to those desired defining contrasts and zeros
elsewhere. The sign of each one is determined by the sign of the
corresponding defining contrast. The result of an application of algorithm II
is usually a set of fitted values for the complete 2^n design; for our
purpose, the non-zero "fitted values" correspond to runs contained in
the desired 2^{n-p} design. This procedure is illustrated below.

Suppose I = -BC = -AC = AB; then,


 Defining  |        0                          III      |  Treatment
 contrast  |   0  inverted    I   II  III   inverted    |   A  B  C
-----------+--------------------------------------------+-----------
 Total     |   1      0       1    0    0       0       |   0  0  0
 C         |   0      1      -1    0    4       4       |   0  0  1
 B         |   0     -1      -1    2    0       0       |   0  1  0
 BC        |  -1      0       1    2    0       0       |   0  1  1
 A         |   0     -1       1   -2    0       0       |   1  0  0
 AC        |  -1      0       1    2    0       0       |   1  0  1
 AB        |   1      0       1    0    4       4       |   1  1  0
 ABC       |   0      1       1    0    0       0       |   1  1  1

We thus find that the runs for this particular fraction are:

          A  B  C
Run 1     0  0  1
Run 2     1  1  0

as we knew to be the case.

This  procedure  can  be  justified  by  remembering  that  algorithms  I  and 
II  perform  inverse  operations.  Hence,  the  validity  of  the  above  procedure 
follows  from  the  validity  of  the  procedure  presented  in  section  10. 
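
The tabulated steps (invert the effect column, apply the sum-and-difference pass, invert again) can be sketched as follows. This is an illustration only: it assumes that algorithm II reduces to these steps for two-level factors, as the worked table suggests, and the function names are hypothetical.

```python
from itertools import product

def yates(column):
    """Repeated pass of pairwise sums followed by pairwise differences."""
    values = list(column)
    rounds = len(values).bit_length() - 1
    for _ in range(rounds):
        pairs = list(zip(values[0::2], values[1::2]))
        values = [a + b for a, b in pairs] + [b - a for a, b in pairs]
    return values

def treatment_combinations(contrasts, factors="ABC"):
    """Runs of the fraction whose defining relation is I = (signed contrasts)."""
    order = list(product((0, 1), repeat=len(factors)))
    labels = ["".join(f for f, level in zip(factors, t) if level) or "Total"
              for t in order]
    signs = dict(contrasts, Total=1)                    # the identity always enters with +1
    column0 = [signs.get(label, 0) for label in labels]
    fitted = yates(column0[::-1])[::-1]                 # invert, pass, invert
    return [t for t, value in zip(order, fitted) if value != 0]

runs = treatment_combinations({"BC": -1, "AC": -1, "AB": 1})
print(runs)
```

The keys of `contrasts` must spell each effect with its letters in the same order as `factors`.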
CONSTRUCTION AND COMPARISON OF NON-ORTHOGONAL
INCOMPLETE FACTORIAL DESIGNS*

Steve  Webb 

Rocketdyne,  A  Division  of  North  American  Aviation,  Inc. 

ABSTRACT.  Experience  in  industrial  consulting  indicates  that  the 
requirements  of  a  real  test  plan  often  differ  from  the  textbook  examples 
in  the  number  of  levels  of  the  factors,  the  interactions  which  can  be 
ignored, and the number of runs in the experiment. The statistical
consultant  must  either  convince  the  experimenter  to  compromise  his 
original  goals,  or  develop  an  ad  hoc  design  based  on  existing  designs 
and  the  former's  intuition. 

This  paper  is  concerned  with  methods  for  constructing  such  designs 
and  criteria  for  comparing  alternatives.  Various  construction  techniques 
are  illustrated  by  examples.  Two  criteria  are  developed,  and  a  conven¬ 
ient  computer  routine  for  evaluating  them  is  described.  Examples  of 
designs  are  given  which  were  constructed  for  actual  experimental 
situations. 

INTRODUCTION  AND  SUMMARY.  Very  often  in  industrial  research 
an  experimental  program  must  be  planned  for  which  existing  fractional 
factorial  designs  are  inadequate.  The  most  common  reasons  for  this 
inadequacy are

1)  the  available  designs  contain  too  many  runs, 

2)  the  factors  to  be  evaluated  in  the  experiment  do  not  all  appear 
at  the  same  numbers  of  levels,  and 

3)  the  particular  set  of  interactions  which  cannot  be  ignored  in 
the  analysis  of  the  experimental  results  does  not  appear  in 
any  of  the  published  designs. 

In such cases the consulting statistician may have a tendency to try
to alter the thinking of the experimenter so that one of the standard
published designs can be used. This is, of course, undesirable from
the experimenter's point of view and increases the probability that the
design will not be carried out as originally planned. As an alternative,
the statistician is faced with the problem of developing an ad hoc test
plan which satisfies the actual objectives and constraints of the real
situation. Using his intuition supplemented by a meager amount of theory he
must come up with a design with satisfactory statistical properties.

*Research sponsored by the Aerospace Research Laboratories, Office
of Aerospace Research, United States Air Force, under Contract AF33
(615)-2818, monitored by Dr. H. Leon Harter.

CRITERIA FOR COMPARING DESIGNS. The response from an
experiment will be denoted by the N-component vector Y, and its expected
value by EY = Xβ, where β is a p-component vector of parameters.
Generally speaking, a good design will have low parameter-estimate
variances, which are proportional to the diagonal elements of (X'X)^{-1}.
For a given experimental situation, that is, specification of the number
of factors, numbers of levels for each factor, and the interactions to
be estimated, a particular finite set of designs is available. In case
one of these designs leads to the minimization of the variance of each
estimate, then there is no selection problem. This does not often happen,
however, except for fractional factorials with all factors at two levels.

In  rare  cases  the  relative  importance  of  the  parameters  to  be  esti¬ 
mated  may  be  known  quantitatively  well  enough  in  advance  so  that  a 
realistic  criterion  can  be  established  based  on  the  variances.  This 
would  usually  take  the  form  of  a  weighted  average  of  the  variances.  Most 
often,  however,  the  relative  importance  of  estimating  the  parameters 
with  low  variances  will  depend  on  their  as  yet  unknown  values. 


A criterion for selecting the design often proposed is the generalized
variance, defined as the determinant of (X'X)^{-1}. A confidence set for
the parameters is the set for which (β - β̂)'(X'X)(β - β̂) ≤ K. The volume
of this ellipsoid is

2 (πK)^{p/2} / [ p Γ(p/2) √det(X'X) ],

which is seen to be related to the design only through the determinant of
the cross-product matrix. It is convenient to consider the determinant
in the form of an index, called the estimation index, defined by

I_E = det(X'X) / (N^p ∏_{i=1}^{p} w_i).

The weights w_i are defined as follows. Let Z be the coefficient matrix
associated with the full factorial; that is, if Y* were a vector of
responses for a full factorial then EY* = Zβ. (The standard
parameterization is such that Z'Z is a diagonal matrix.) Let d_i represent the i-th
diagonal entry of Z'Z and let M represent the total number of runs in
the full factorial. Then w_i = d_i/M.

Often the purpose of an experiment is to obtain overall information
about the response. In these cases the appropriate criterion is based
on the average variance of a fitted response, where the average is taken
over all M points of the full factorial. The average variance is
proportional to Σ w_i V_i, where the V_i are the diagonal elements of (X'X)^{-1}.

A convenient representation is through the "fitting index"

I_F = p / (N Σ_{i=1}^{p} w_i V_i).
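
Both indices can be evaluated directly from the definitions. The following is a minimal sketch, not the Fortran code described later in the paper: it assumes a -1/+1 coding for two-level factors and a main-effects model, and the two example designs (a half replicate and a five-run subset of the 2^3) are chosen here only for illustration.

```python
import numpy as np
from itertools import product

def design_indices(X, Z):
    """Return (I_E, I_F) for design matrix X relative to full-factorial matrix Z."""
    N, p = X.shape
    M = Z.shape[0]
    w = np.diag(Z.T @ Z) / M                 # weights w_i = d_i / M
    XtX = X.T @ X
    V = np.diag(np.linalg.inv(XtX))          # V_i, proportional to the variances
    I_E = np.linalg.det(XtX) / (N ** p * np.prod(w))
    I_F = p / (N * np.sum(w * V))
    return I_E, I_F

def main_effects_matrix(runs):
    """Columns: mean, then one -1/+1 column per two-level factor."""
    return np.array([[1.0] + [2.0 * level - 1.0 for level in run] for run in runs])

Z = main_effects_matrix(list(product((0, 1), repeat=3)))          # full 2^3
half = main_effects_matrix([(0, 0, 1), (0, 1, 0), (1, 0, 0), (1, 1, 1)])
five = main_effects_matrix([(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0), (1, 1, 1)])

print(design_indices(half, Z))   # regular fraction: both indices equal 1
print(design_indices(five, Z))
```

For the five-run design the cross-product matrix is 4I + J (J a matrix of ones), which gives I_E = 512/625 and I_F = 32/35.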

More  generally,  an  index  could  be  based  on  the  integrated  variance  of 

a  fitted  response.  Such  an  index  would  in  general  involve  off-diagonal 

elements of (X'X)^{-1}, and would be difficult to define in a way which is

general  enough  for  both  quantitative  and  qualitative  factors.  Experience 

has  showed  I  to  be  a  very  useful  index. 

£ 

Consider the class of models which is "complete" in the sense that
if any interactions between a pair of factors appear in the model, then
all interactions between them appear. It has been proved [1] that for
models which are complete in this sense, the maximum value of both
I_E and I_F is unity. In [2] it is shown that the maximum is achieved
if certain combinations of levels appear with equal frequency. An
equivalent criterion is that the cross-product matrix X'X is
proportional to the cross-product matrix Z'Z for the full factorial. All
regular fractional factorials have this property. If interaction parameters
do not appear in complete sets, either or both indices may be greater than
unity.

Thus far nothing has been said about the parameterization used to
describe the response, that is, how β is defined in terms of the expected
responses at the various treatment combinations, or equivalently, how
the elements of the X matrix are defined. Since the parameterization
is to a large extent arbitrary, a particularly appealing property of the two
indices is that they are invariant under nonsingular reparameterizations.
That is, suppose EY = Xβ = XAα, and similarly EY* = Zβ = ZAα, where
A is nonsingular. It can be demonstrated that I_E and I_F are identical
whether calculated under the parameterization β or α. Thus, the
parameterization is immaterial as far as these criteria are concerned.
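
The invariance can be checked numerically. The sketch below rests on an assumption not stated in the text: when Z'Z is not diagonal, the indices are taken in the equivalent matrix forms I_E = (M/N)^p det(X'X)/det(Z'Z) and I_F = pM / (N tr[(X'X)^{-1} Z'Z]), which reduce to the definitions above when Z'Z is diagonal. The design, the matrix A and the function name are illustrative only.

```python
import numpy as np
from itertools import product

def indices_general(X, Z):
    """(I_E, I_F) in a form that does not require Z'Z to be diagonal."""
    N, p = X.shape
    M = Z.shape[0]
    XtX, ZtZ = X.T @ X, Z.T @ Z
    I_E = (M / N) ** p * np.linalg.det(XtX) / np.linalg.det(ZtZ)
    I_F = p * M / (N * np.trace(np.linalg.inv(XtX) @ ZtZ))
    return I_E, I_F

def main_effects_matrix(runs):
    return np.array([[1.0] + [2.0 * level - 1.0 for level in run] for run in runs])

Z = main_effects_matrix(list(product((0, 1), repeat=3)))
X = main_effects_matrix([(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0), (1, 1, 1)])

# An arbitrary nonsingular reparameterization beta = A alpha (determinant 2).
A = np.array([[1, 1, 0, 0],
              [0, 1, 1, 0],
              [0, 0, 1, 1],
              [0, 0, 0, 2]], dtype=float)

before = indices_general(X, Z)
after = indices_general(X @ A, Z @ A)
print(before, after)
```

The determinant of A cancels in the ratio for I_E, and the trace in I_F is unchanged because (A'X'XA)^{-1} A'Z'ZA is similar to (X'X)^{-1} Z'Z.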

Without  the  use  of  electronic  computers,  the  computation  of  the 
indices  would  be  extremely  tedious.  A  computer  code  has  been  written 
for  routine  and  convenient  comparison  of  alternative  incomplete  factorial 
designs. A detailed description of this code and its use is available [3].
Any number of designs may be evaluated simultaneously by reading in
to the computer the treatment combinations in each. The evaluation will
be made for up to five models (specification of interaction terms to be
included in the model). A number of options is available to the user,
including changing the parameterization used for two-, three-, or
four-level factors, or changing the weights used in computing the indices.

A Fortran listing is included in reference [3].

METHODS  OF  CONSTRUCTION. 

1. Exhaustive Enumeration. For a few simple experimental
situations it is feasible to enumerate all possible designs. The optimum
design can then easily be chosen. As an example, consider as an
experimental situation a 2^3 in 5 runs with no interactions. There are
exactly eleven nonsingular designs, which together with their properties
are given in Table I. Clearly, the best designs are the eighth and ninth,
for which each variance is minimized.
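
A brute-force enumeration of this kind is easy to sketch. The version below makes simplifying assumptions that differ from the classification in Table I: the situation is taken to be a 2^3 with a main-effects model, only designs with five distinct runs are listed, equivalent designs are not grouped together, and designs are ranked by det(X'X) alone.

```python
from itertools import combinations, product
import numpy as np

def main_effects_matrix(runs):
    """Columns: mean, then one -1/+1 column per two-level factor."""
    return np.array([[1.0] + [2.0 * level - 1.0 for level in run] for run in runs])

points = list(product((0, 1), repeat=3))        # the 8 treatment combinations
results = []
for runs in combinations(points, 5):            # every 5-run subset, no repeated runs
    X = main_effects_matrix(runs)
    det = int(round(float(np.linalg.det(X.T @ X))))
    results.append((det, runs))

best_det = max(det for det, _ in results)
best = [runs for det, runs in results if det == best_det]
print(len(results), best_det, len(best))
```

Under these assumptions the best subsets are those obtained by deleting three runs that differ pairwise in exactly two factors.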

2. One Parameter at a Time. It is always possible to construct a
saturated design (although such designs are very inefficient) by allocating one
run to the estimation of each parameter. For example, a 3^2 x 2^2 with
the linear-by-linear interaction between the two three-level factors is
as follows:

    0 0 0 0     mean
    1 0 0 0  }  effects of first factor
    2 0 0 0  }
    0 1 0 0  }  effects of second factor
    0 2 0 0  }
    2 2 0 0     interaction
    0 0 1 0     effect of third factor
    0 0 0 1     effect of fourth factor

where we have indicated the parameter estimated from each run. The
fitting and estimation indices are .24 and .025, respectively.

3. Correspondence. The theory for mixed factorial designs is
less well developed than that for designs in which all factors appear at
the same number of levels. A useful technique is to construct a design
with all factors at the same number of levels, then replace some of
the factors with ones of real interest using a fixed correspondence
between sets of levels. The best-known examples of this technique
are the proportional-frequency designs of Addelman [4]. To
demonstrate this approach consider a Latin Square of side 3.

    0 0 0 0
    0 1 1 1
    0 2 2 2
    1 0 1 2
    1 1 2 0
    1 2 0 1
    2 0 2 1
    2 1 0 2
    2 2 1 0

The last two factors may be replaced by two-level factors by using
the correspondence

    0 -> 0
    1 -> 1
    2 -> 1

which results in the design

    0 0 0 0
    0 1 1 1
    0 2 1 1
    1 0 1 1
    1 1 1 0
    1 2 0 1
    2 0 1 1
    2 1 0 1
    2 2 1 0

This design is quite efficient, having a fitting index of .93 and an
estimation index of .79. A number of different types of correspondences is
given by Addelman in [4].
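
The collapsing step is a simple relabeling of levels in the chosen columns. A minimal sketch follows; the function name `apply_correspondence` is introduced here for illustration.

```python
# The nine runs of the Latin Square arrangement of side 3.
LATIN_SQUARE = [
    (0, 0, 0, 0), (0, 1, 1, 1), (0, 2, 2, 2),
    (1, 0, 1, 2), (1, 1, 2, 0), (1, 2, 0, 1),
    (2, 0, 2, 1), (2, 1, 0, 2), (2, 2, 1, 0),
]

# Correspondence from three levels to two: 0 -> 0, 1 -> 1, 2 -> 1.
COLLAPSE = {0: 0, 1: 1, 2: 1}

def apply_correspondence(design, mapping, columns):
    """Relabel the levels of the given (0-based) columns according to mapping."""
    return [tuple(mapping[level] if i in columns else level
                  for i, level in enumerate(run))
            for run in design]

mixed = apply_correspondence(LATIN_SQUARE, COLLAPSE, columns={2, 3})
for run in mixed:
    print(*run)
```

Because level 1 of each new factor absorbs two of the original three levels, it appears twice as often as level 0, which is the proportional-frequency property.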

4. Permutation-Invariant Designs. The salient property of
permutation-invariant designs, defined in [6], is that estimates
involving factors which appear at the same number of levels have the
same variance properties. More formally, the cross-product matrix
X'X remains unaltered if factors appearing at the same number of
levels are permuted. An example of a 3^2 x 2^3 main effect design,
for which I_F = .80 and I_E = .47, is:

    0 0 1 0 0
    0 1 0 0 1
    0 2 0 1 0
    1 0 0 0 0
    1 1 1 1 1
    1 2 1 1 1
    2 0 0 1 1
    2 1 1 1 0
    2 2 1 0 1

Using a standard parameterization, the X and X'X matrices for
this design are:

Permutation of factors appearing at the same numbers of levels has
the effect of permuting rows and columns of the submatrices in the
partitioned cross-product matrix. Since the submatrices are invariant,
the design is permutation-invariant.
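
The invariance of X'X for the nine-run 3^2 x 2^3 design above can be verified by brute force. The sketch assumes a particular parameterization (linear and quadratic orthogonal contrasts for three-level factors, -1/+1 for two-level factors); this coding is an assumption made for illustration and may differ from the standard parameterization used in the paper.

```python
from itertools import permutations
import numpy as np

DESIGN = [
    (0, 0, 1, 0, 0), (0, 1, 0, 0, 1), (0, 2, 0, 1, 0),
    (1, 0, 0, 0, 0), (1, 1, 1, 1, 1), (1, 2, 1, 1, 1),
    (2, 0, 0, 1, 1), (2, 1, 1, 1, 0), (2, 2, 1, 0, 1),
]

LINEAR = {0: -1, 1: 0, 2: 1}      # assumed linear contrast, three-level factor
QUADRATIC = {0: 1, 1: -2, 2: 1}   # assumed quadratic contrast, three-level factor
TWO_LEVEL = {0: -1, 1: 1}         # assumed coding, two-level factor

def cross_product(design):
    """X'X for the main-effects model under the assumed coding."""
    rows = [[1, LINEAR[a], QUADRATIC[a], LINEAR[b], QUADRATIC[b],
             TWO_LEVEL[c], TWO_LEVEL[d], TWO_LEVEL[e]]
            for a, b, c, d, e in design]
    X = np.array(rows, dtype=float)
    return X.T @ X

base = cross_product(DESIGN)
invariant = True
for three_level in permutations((0, 1)):          # permute the three-level factors
    for two_level in permutations((2, 3, 4)):     # permute the two-level factors
        order = three_level + two_level
        permuted = [tuple(run[i] for i in order) for run in DESIGN]
        invariant = invariant and np.array_equal(cross_product(permuted), base)

print(invariant)
```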

This principle has been used* to construct a series of as yet
unpublished saturated second-order designs for three-level factors. For five
factors the design contains the treatment combination 0 0 0 0 0, the
five treatment combinations which are permutations of 1 1 1 1 0, the
five permutations of 2 2 2 2 0, and the ten permutations of 2 2 0 0 0.
For this design the fitting index is .66 and the estimation index is 2.35.
Relative to the full factorial but adjusting for the difference in the number
of runs, the efficiency of the estimate of the mean is 82%, of the linear
main effects is 114%, of the quadratic main effects is 25%, and of the
linear by linear interactions is 171%. The reason that the linear effects
and interactions are so efficient is that the points of the design tend to
be concentrated around the outside of the hypercube.

5.  Balancing  Levels.  A  very  useful  technique  for  constructing 
designs  is  to  start  with  an  ordinary  factorial  structure  for  the  first 
group of factors, and then insert the remaining factors in such a way that
pairs  of  levels  appear  together  with  nearly  equal  frequencies.  For 
example,  the  following  two  designs  are  obtained  by  adding  another  two- 
level  factor  to  a  basic  2x3  full  factorial: 


*This  work  was  carried  out  by  R.  L.  Rechtschaffner  of  Rocketdyne's 
Statistical  Test  Design  Unit. 
    Design 1      Design 2

     0 0 0         0 0 0
     0 1 1         0 1 1
     1 0 1         1 0 1
     1 1 0         1 1 1
     2 0 0         2 0 1
     2 1 1         2 1 0

Their variance properties are given in Table II.

EXAMPLES.  Three  ad  hoc  designs  which  have  been  used  successfully 
at  Rocketdyne  will  be  mentioned  briefly.  The  first  involved  determination 
of  char  formation  rate  in  ablative  heat-shield  material  under  simulated 
reentry  conditions.  The  testing  was  done  in  a  small  stationary  hydrogen- 
oxygen  rocket  engine.  The  experimental  variables  were  rocket  engine 
combustion  chamber  pressure,  propellant  mixture  ratio,  and  the  angle 
of  the  sample  in  the  rocket  exhaust.  The  experimental  design  chosen 
was one of the optimum 2^3 designs in 5 runs discussed earlier.


    Run       Target Chamber     Target Mixture    Inclination Angle
    Number    Pressure (psia)    Ratio             (degrees)

    1         170                4                 0
    2         250                4
    3         170                16                12
    4         250                16                0
    5         250                16                12

Another  such  design  was  used  on  a  Signal  Corps  battery  program.  The 
experimental  work  involved  screening  4  cathode  materials,  3  solvents, 
and  4  salts.  The  design  was  constructed  by  balancing  the  levels  of  the 
second  four-level  factor  within  the  framework  of  the  12-run  3x4 
factorial. 


3x2x2  'BALANCED'  DESIGNS 
    Run
    Number    Cathode    Solvent    Salt

     1        0          0          0
     2        0          1          1
     3        0          2          3
     4        1          0          1
     5        1          1          0
     6        1          2          2
     7        2          0          2
     8        2          1          3
     9        2          2          1
    10        3          0          3
    11        3          1          2
    12        3          2          0

Although there was no justification for assuming interactions did not
exist, they could reasonably be expected to be less important than
main effects. It was intended that this experiment be used to
eliminate from contention some of the candidate materials with just a few
tests, so that later tests could concentrate on the better ones. The
actual decision made from these tests was that none of the four cathode
materials was satisfactory, and later testing should be directed at
finding additional materials. If all interactions had been considered, 48
tests, using these four unsatisfactory materials, would have been required.

The balancing technique was used effectively to construct a 3^4 x
2^3 design in 27 runs for a program concerned with the evaluation of
fiber-reinforced plastic laminates. The variables are as follows:


    Variable                   Code    Levels

    Bonding Pressure           A       3
    Bonding Temperature        B       3
    Resin Concentration        C       3
    Post-Cure Temperature      D       3
    Bonding Time               E       2
    Post-Cure Time             F       2
    Fiber Quality              G       2

It was established that the linear interactions AB, AC, BC, BE,
and  DF  are  expected  to  be  important.  Since  the  factor  D  does  not 
interact  with  the  other  three  three-level  factors,  the  starting  point  was 
a  1/3  replicate  of  a  34  using  as  defining  contrast  I  =  A^  B*  D  . 

For the 2^3 part of the design three replications of the 2^3 plus three
additional points were used. The 2^3 part was associated with the 3^4 part
in a number of ways, and the best design selected. The third and fourth
designs were singular. The first, and best, design is presently being
implemented.
Directorate of Medical Research, CRDL,
Edgewood Arsenal, Maryland


The  Directorate  of  Medical  Research,  CRDL,  Edgewood  Arsenal, 
Maryland  has  the  mission  of  investigating  the  physiological  effects  of 
certain  chemical  substances  on  both  human  and  animal  subjects.  One 
of  the  machines  used  to  measure  these  effects  is  a  physiograph.  This 
machine  which  is  commonly  used  in  hospitals  measures  temperature, 
pulse,  breathing  rate  and  both  systolic  and  diastolic  blood  pressure. 

The common hospital version displays the information only; however,
in our scientific work a permanent recording was desired, so an
analog-to-digital converter and a punched paper tape output were installed
on a unit by the manufacturer, Air Shields of Hatboro, Pennsylvania.
Originally a flexowriter was used for the output device; later, a
Frieden SP-2 tape punch was substituted to reduce noise.

This machine can be used on both human and animal subjects. It
was first used by our Clinical Division with humans. It was shut down
for some months when difficulties were encountered with the sensors
picking up the signal from the subject. Later, with better sensors, it
was put to use again, this time with dogs. The speed of recording can
be adjusted. So far we have run at a rate where a complete set of
5 measurements is recorded every 5 seconds. Lower rates are
possible and in many cases desirable, particularly where changes occur
only gradually.

When the paper tape is received by the computer section, what is
seen is a series of 4-digit numbers followed by a stop code, where every
5th number is of the same kind. The numbers are first checked by the
computer for magnitude. For humans, temperature is assumed to be
at least 90, pulse 50, breathing 5, systolic blood pressure 50, and
diastolic 20. If all readings are at least as large as those above, the
readings are reduced by the above for internal computations. Otherwise,
an error stop occurs. It is felt that if the physiograph ever gets out of
sequence the above checks would bring it to a rapid halt since, for
instance, reading breathing rate for blood pressure would bring an
error halt.
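
The magnitude check can be sketched as follows. This is an illustration, not the original program: the order of the five readings within a set, the units, and the systolic floor of 50 (the scanned value is unclear) are assumptions, and the names `FLOORS`, `SequenceError` and `reduce_readings` are hypothetical.

```python
# Assumed order of the five readings in each set, with the minimum plausible values.
FLOORS = {
    "temperature": 90,
    "pulse": 50,
    "breathing": 5,
    "systolic": 50,    # assumption: the printed value is unclear in the source
    "diastolic": 20,
}
ORDER = tuple(FLOORS)

class SequenceError(ValueError):
    """Raised when a reading falls below its floor (the 'error stop')."""

def reduce_readings(numbers):
    """Check each set of five readings and subtract the floors for internal use."""
    if len(numbers) % len(ORDER) != 0:
        raise SequenceError("incomplete set of readings")
    reduced = []
    for start in range(0, len(numbers), len(ORDER)):
        record = {}
        for name, value in zip(ORDER, numbers[start:start + len(ORDER)]):
            if value < FLOORS[name]:
                raise SequenceError(f"{name} reading {value} is below {FLOORS[name]}")
            record[name] = value - FLOORS[name]
        reduced.append(record)
    return reduced

print(reduce_readings([98, 72, 16, 120, 80]))
```

A tape that has slipped by one position fails quickly, because a pulse or breathing value read in the temperature slot falls below 90.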
After reading a predetermined number of entries, or from a signal
on the input tape, computations are begun. The mean, 95% confidence
limits, standard error and coefficient of variation are computed for
each of the 5 types of measurements, together with all ten two-factor
correlation coefficients.
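
These summaries are straightforward to compute. The sketch below is not the original program: it uses a normal-approximation multiplier of 1.96 for the 95% limits (the method actually used is not stated), and the sample records and names are hypothetical.

```python
import math
import numpy as np

NAMES = ("temperature", "pulse", "breathing", "systolic", "diastolic")

def summarize(records):
    """Per-measurement summaries plus the ten pairwise correlation coefficients."""
    data = np.asarray(records, dtype=float)
    n = data.shape[0]
    mean = data.mean(axis=0)
    sd = data.std(axis=0, ddof=1)            # sample standard deviation
    se = sd / math.sqrt(n)                   # standard error of the mean
    z = 1.96                                 # assumption: normal approximation
    summary = {
        name: {
            "mean": mean[i],
            "se": se[i],
            "ci95": (mean[i] - z * se[i], mean[i] + z * se[i]),
            "cv": sd[i] / mean[i],           # coefficient of variation
        }
        for i, name in enumerate(NAMES)
    }
    corr = np.corrcoef(data, rowvar=False)
    pairs = {(NAMES[i], NAMES[j]): corr[i, j]
             for i in range(len(NAMES)) for j in range(i + 1, len(NAMES))}
    return summary, pairs

# Hypothetical readings, one row per 5-second set.
sample = [
    [98.6, 72, 16, 120, 80],
    [98.4, 75, 15, 118, 79],
    [98.7, 70, 17, 122, 82],
    [98.5, 74, 16, 119, 81],
]
summary, pairs = summarize(sample)
print(summary["pulse"]["mean"], len(pairs))
```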

It  is  hoped  that  the  mean  values  and  their  standard  errors  will 
indicate  longer  term  effects  of  the  chemical.  For  instance,  significant 
changes  might  be  shown  to  occur  from  1  to  4  hours  after  administration, 
and  apparent  recovery  thereafter.  The  correlations  are  hoped  to  show 
up  more  subtle  changes.  For  instance  a  negative  correlation  between 
pulse  and  blood  pressure  is  considered  abnormal. 

Unfortunately  the  change  from  a  flexowriter  to  an  SP-2  punch  output 
took  longer  than  anticipated  and  to  date  we  have  only  data  from  early 
human  runs  with  inaccurate  sensors  but  no  drug  runs.  It  is  hoped  that 
dog  drug  runs  will  start  this  month.  An  output  from  a  test  run  is  shown 
as  Figure  1  to  illustrate  format. 
AN APPLICATION OF EXPERIMENTAL DESIGN IN ERGONOMICS:
HEART RATE AS A FUNCTION OF WORK STRESS AND TIME


H.  B.  Tingey*  and  W.  H.  Kirby,  Jr. 

Ballistic  Research  Laboratories 
Aberdeen  Proving  Ground,  Maryland 


ABSTRACT;  This  presentation  concerns  the  establishment  of  a 
relationship  between  heart  rate  and  imposed  physical  workloads  for  a  given 
time  period  for  a  small  group  of  young  males.  A  hypothesis  was  developed, 
experiments  designed,  data  collected  under  controlled  conditions,  and  the 
results  analyzed  using  classical  statistical  methods.  The  results  demon¬ 
strate  that  the  underlying  functional  relationship  alters  as  the  stimulus 
changes.  In  this  case  the  alterations  may  be  defined  over  five  segments 
of  time. 


1.  INTRODUCTION.  Studies  of  changes  in  the  human  circulation 
have  been  made  from  many  points  of  view.  Physiologists  and  others  have 
long  been  interested  in  the  effects  of  physical  work  on  the  circulatory 
system.  Many  of  these  studies  have  used  heart  rate  behavior  as  an  indica¬ 
tion  of  the  circulatory  system's  capacity  to  respond  to  physical  workloads. 
Heavy,  medium  and  light  workloads  have  been  considered  under  various 
environmental  conditions  of  temperature  and  humidity. 

However,  to  the  best  of  our  knowledge,  there  has  been  no  attempt  to 
study  these  clinical  and  physical  relationships  using  more  classical 
statistical procedures in association with pre-experimental hypothesis
formulation.  The  usual  approach  is  to  collect  large  amounts  of  data, 
tabulate  it  and/or  plot  it  on  a  graph.  Then  generalized  clinical  interpre¬ 
tations  are  made.  Occasionally,  a  statistician  is  asked  to  assist  in  doing 
something  with  the  data  following  its  collection. 

This study was done as an exploratory exercise not only to investigate
the possibility of an underlying relationship between heart rate and physical
load, but as a means of bringing the engineer, physician, and statistician
together on a problem of common interest. We wanted to consider each
other's viewpoints in reference to a physical-medical problem. There
was also a common interest in employing more scientific method in this area
of research.

* Now Assistant Professor of Statistics and Computer Science, University
of Delaware, Newark, Delaware.
We all knew that heart rate would increase with physical exertion and
decrease following the cessation of it. However, we were interested in
knowing the more precise nature of the rise and fall for different degrees
of work intensity. As simplifications we decided to hold the work period
and environment constant. The physical workloads were chosen in this
first study for convenience and measurability.

Our longer range objectives include the development of predictive
functions relating more generalized stress situations on the human system
using this type of approach. Additional cardiovascular system phenomena
which are also of potential interest to other researchers, clinicians, and
those concerned with the effects of various forms of stress are being
considered. Such phenomena may include, among others, coagulation
factors, measures of hypoxia, and biochemical constituents.

2.  METHODS. 

2.1 Scope and Procedure.

The purpose of this experiment was to assess the reaction of the
human heart rate to work stimulation. The conduct of the experiment
took the following line.

A method of work was selected which may be described as a form
of weight lifting. Preliminary trials were made to determine a set of
weights, number of repetitions and frequency which could be accomplished
by the five involved subjects. It was decided that available bar-bell weights,
namely 21.6 lbs., 26.6 lbs., and 31.6 lbs., would be used. Each bar-bell
was to be raised from the chest position to maximum vertical height and
lowered with minimum restraint to the starting position. This cycle was
repeated 30 times at a timed (metronome) rate of two seconds, resulting
in approximately one minute of intensive physical activity. The subjects
themselves were a non-random sample of available personnel.

A  brief  physical  description  of  the  five  subjects  who  were  healthy 
males  is  as  follows: 
    No.    Age    Weight    Height

    1.     35     175       5'-9"
    2.     30     230       6'-4"
    3.     44     180       5'-9"
    4.     24     135       5'-7"
    5.     25     155       5'-9"

The experiment was conducted over three successive days for three
successive weeks. Each repetition of the experiment started on Sunday
and terminated on Tuesday of the week. On each day the experiment was
started at the same time of day and the subjects performed in the same
sequence. On the first day the 21.6-lb. weight was used, with the 26.6-lb.
and 31.6-lb. weights used on the second and third days, respectively.

The room was air conditioned and temperature and humidity were
essentially constant throughout the investigation.

Five minutes prior to the initiation of the weight-lifting exercise
each subject was seated in a chair adjacent to the apparatus. Small patch
electrodes had already been positioned on each side of the bare chest at
the mid-clavicular line just above the lower costal border for the
continuous recording of the electrocardiogram. The recording was accomplished
using a telemetering apparatus and commenced immediately after the
subject was seated. This first phase, which began at -300 seconds,
terminated at -60 seconds.

The subject then arose, stepped onto the force platform, moved into
a predetermined standing position with the forearms against the chest,
elbows acutely flexed, and hands positioned to receive the bar-bell from
others. At approximately -5 seconds he was handed the bar-bell and at
zero seconds he began the exercise, ending with the termination of the
30th cycle at +60 seconds. Others relieved him of the bar-bell
immediately following the cessation of exercise. The subject then stepped down
from the platform and sat in a chair resting for the remaining 540 seconds.
Then he was removed from the experiment and the continuous monitoring
of the electrocardiogram ceased.

This sequence of events led to five time zones to consider for curve
fitting, namely, (1) a rest phase with essentially constant heart rate
(time: -300 sec. to -60 sec.); (2) a preparation phase with linear increase
in heart rate (time: -60 sec. to -40 sec.); (3) a short recovery phase with
linear decrease in heart rate (time: -35 sec. to -5 sec.);* (4) the measured
work phase with linear increase (time: actually 0 sec. to 60 sec. but heart
rate changes occurred between -5 sec. to 55 sec.; the latter is used); and
(5) the recovery phase with exponential decrease (time: 55 sec. to 600 sec.).
As mentioned, heart rate was recorded continuously (via telemetered ECGs)
and the distance of each lift recorded photographically. Apparatus and
measurement equipment are discussed in Section 2.5.

2.2 Hypothesis.

The general hypothesis initially considered expressed heart rate to
be some function of workload and time. Symbolically, it was stated, H.R. =
f(L,T). One could make the expression more explicit by adding a constant
of proportionality and giving both L (measured load) and T (measured time)
exponents. Because of the sequence of events which took place, the
initial hypothesis was modified to consider the five time periods during
which the individuals were measured. This led us to the following:

$H_0$: (a) The regression relationship between a workload and
heart rate is given over each of the five segments as a function of time.

(1) $H.R. = k$, $k > 0$, $-300 < t < -120$,

(2) $H.R. = k_1 + \beta t$, $k_1, \beta > 0$, $-60 < t < -40$,

(3) $H.R. = k_2 + \beta_1 t$, $k_2 > 0$, $\beta_1 < 0$, $-35 < t < -5$,

(4) $H.R. = k_3 + \beta_2 t$, $k_3 > 0$, $\beta_2 > 0$, $0 < t < 55$,

(5) $H.R. = k_4 t^{\alpha} e^{-\beta_3 t}$, $k_4 > 0$, $\alpha > 0$, $\beta_3 > 0$, $60 < t < 600$.
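The five segment models of hypothesis (a) can be sketched as a single piecewise function. The following is a minimal illustration only: the function names, the parameter dictionary, and the numerical values in the usage note are hypothetical, and the decaying form assumed for segment (5), k t^alpha exp(-beta t), is one reading of the printed formula.

```python
import math

# Open time windows (seconds) for the five segments of hypothesis (a).
SEGMENTS = [
    (1, -300.0, -120.0),  # rest: constant
    (2, -60.0, -40.0),    # preparation: linear increase
    (3, -35.0, -5.0),     # short recovery: linear decrease
    (4, 0.0, 55.0),       # work: linear increase
    (5, 60.0, 600.0),     # recovery: exponential-type decrease
]


def segment_of(t):
    """Return the segment number whose open interval contains t, or None in a gap."""
    for number, lo, hi in SEGMENTS:
        if lo < t < hi:
            return number
    return None


def heart_rate(t, params):
    """Evaluate the hypothesized model at time t.

    params maps segment number -> dict of coefficients (hypothetical layout):
      segment 1:     {"k": ...}
      segments 2-4:  {"k": ..., "beta": ...}
      segment 5:     {"k": ..., "alpha": ..., "beta": ...}
    Returns None when t lies between segments.
    """
    seg = segment_of(t)
    if seg is None:
        return None
    p = params[seg]
    if seg == 1:
        return p["k"]
    if seg in (2, 3, 4):
        return p["k"] + p["beta"] * t
    # Assumed form for segment 5: k * t**alpha * exp(-beta * t).
    return p["k"] * t ** p["alpha"] * math.exp(-p["beta"] * t)
```

For example, with a hypothetical work-phase entry `{4: {"k": 96.0, "beta": 0.5}}`, `heart_rate(10.0, params)` evaluates the linear segment at t = 10 seconds, while a time such as 57 seconds falls in a gap between segments and yields `None`.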

Note: The actual relationship might be specified by a single
relationship, but more careful planning in the light of
this experiment is required. One might state the overall
relationship as:

H. R.

* One might be led to consider this interval as two segments, whereas
our original hypothesis was that over a short interval the heart rate decrease
could be considered linear.


(b) The regression relationship between a time and heart
rate is given as a function of load:

$H.R. = \alpha(t) + \beta(t) L$.


Initially the data are subjected to the analysis of variance for a
three-way layout, with appropriate tests for the significance of main effects
(and the particular intervals over which they are significant) and for the
detection of possible interactions which may be present. The hypothesis tests
have followed the standard F-test procedure and are indicated in the ANOVA
(Analysis of Variance) Table III.

2.3 Design.

The  basic  design  employed  for  each  replicate  of  the  experiment  is 
a  two-way  layout  using  time  and  theoretical  load  as  controlled  variables 
with  heart  rate  as  the  response  variable.  The  general  formulas  are 
given in Table I.

TABLE I

General Formulas for Two-Way Layout

Source

It is perhaps more desirable to analyze the data over the three trials
of the experiment by introducing another main effect for repetitions of the
experiment. Hence the analysis of variance takes on the pattern of a three-way
layout with several (say n) observations per cell. The general formulas
for this situation are presented in Table II.

The  data  from  the  experiment  are  used  according  to  the  formulas  in 
Table  II  to  calculate  the  results  given  in  Table  III.  The  error  sum  of 
squares  should  indicate  the  approximate  value  of  the  residual  error  after 
fitting  the  regression  lines  proposed  in  the  original  hypothesis.  One  may, 
as  a  matter  of  interest,  test  the  significance  of  the  mean  squares  for  main 
effects  and  interactions.  This  would  then  lead  the  investigator  to  an  analysis 
to determine the regression which might exist over each of the five intervals.

Additionally, results from the mean squares fitting for the fixed-time,
variable-load case and the fixed-load, variable-time case are presented in
Tables IV and V.

2.4  Instrumentation  and  Equipment. 

The weights used in this experiment were obtained from a commercially
available bar-bell (dumb bell) set, the components of which were weighed
to the nearest tenth of a pound. The components were assembled in three
combinations to give the test weights of 21.6, 26.6, and 31.6 pounds.

The experiment was conducted on the surface of a force-platform
of special design capable of making accurate measurements of forces in
the three orthogonal axes and of moments about these three axes. While
the platform impulses were measured, discussion concerning them is
beyond the scope of this presentation.

Heart rates were obtained from a TELEMEDICS Radio-Electrocardiograph
known commercially as the RKG 100 System, which is composed
of a receiver Model MCM and transmitter Model 100 A. The associated
electrodes, as mentioned previously, were positioned in order to minimize
muscular noise and prevent their premature loosening. Very sharp
QRS complexes were obtained. The e.c.g. profiles were recorded simultaneously
with impulses from the platform on both a Sanborn 8-channel
Paper Recording System, Model 853-5460, and a Sanborn-Ampex Magnetic
Data Recording System, Model 2007.

The metronome was a battery-powered electromechanical oscillator
with amplifier and speaker calibrated to give the desired frequency (one
pulse per second).

16mm  motion  pictures  were  taken  of  each  subject  during  each  exercise. 
The  camera  was  located  in  order  to  record  the  appropriate  movements 
of  each  subject  in  association  with  fiducial  markers. 

3.  RESULTS. 

3.1  Results  and  Interpretations. 

The computations noted in Tables I and II were carried out and are
shown in Table III.

It  was  found  that  there  were  significant  differences  in  the  response 
due  to  different  loads.  This  was,  of  course,  a  gratifying  result  inasmuch 
as  the  increment  between  levels  of  load  was  rather  small.  The  resulting 
F-ratio  is  more  than  adequate  for  the  stated  significance  level.  This 
effect  can  be  appreciated  graphically  by  referring  to  Figure  1. 

The next control variable, time, is again highly significant, as was to
be expected. The significance here, as well as that of the previous effect,
i.e., load, may well stimulate the analyst to consider the functional fit to the
data proposed in the original hypothesis.

A difficulty encountered from the analytical point of view occurs when
one observes that the Repetitions by Load interaction and the Load by
Time interaction are both significant. Considering the former, Repetitions
by Load, the explanation here must come more from clinical considerations
than from statistical interpretations alone. While the entire experiment
was considered to be one that could be repeated, one can note that the
subjects under consideration, although healthy, were not in top physical
condition. As the experimental series progressed, an improvement (or
degradation) in the physical condition probably occurred. Techniques
also improved during the conduct of the experiment. In addition, there
were one or two minor changes in apparatus which might account for this
effect. Additionally, the subjects were not isolated from normal daily
routine before and during the experiment. Perhaps the effect of psychological
factors operating through the autonomic nervous system may be more
important than can be identified at this time.

TABLE III

Analysis of Variance

Source          DF          SS          MS    F-Ratio   Significance
Replications     2     2026.58     1013.29      8.64    None
Load             2     7504.00     3752.00     31.99    *
Time            43   317105.39     7374.54     62.87    **
RxL              4     5462.93     1365.73     11.67    **
RxT             86     4874.66       56.68     0.483    None
LxT             86    15868.31      184.52     1.573    **
RxLxT          172     9693.96       56.36     0.460    None
Error         1584   185808.71      117.30
Total         1979   548344.54

 * Significant at 5% Level.
** Significant at 1% Level.

The Load by Time interaction probably receives its largest contribution
from the differences in heart rate accelerations and peak values which occur
over the working phase. In comparison with the close similarity of the
curves of the other time segments of the experimental cycle, this interaction
could perhaps be avoided in subsequent experimentation by considering
measurements only over the working phase. However, this approach cannot
be taken until reasonable baselines are established for pre- and post-work
periods.
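The arithmetic behind Table III can be retraced with a short sketch: each mean square is the sum of squares divided by its degrees of freedom, and each F-ratio is that mean square divided by the error mean square. The degrees of freedom and sums of squares below are those printed in Table III; the function and variable names are illustrative. Most recomputed ratios agree with the printed ones to rounding; any that do not may reflect errors in reproduction of the table.

```python
# (source, degrees of freedom, sum of squares) as printed in Table III
EFFECTS = [
    ("Replications", 2, 2026.58),
    ("Load", 2, 7504.00),
    ("Time", 43, 317105.39),
    ("RxL", 4, 5462.93),
    ("RxT", 86, 4874.66),
    ("LxT", 86, 15868.31),
    ("RxLxT", 172, 9693.96),
]
ERROR_DF, ERROR_SS = 1584, 185808.71


def anova_rows(effects, error_df, error_ss):
    """Return the error mean square and {source: (mean square, F-ratio)}."""
    mse = error_ss / error_df
    rows = {}
    for name, df, ss in effects:
        ms = ss / df              # mean square = SS / DF
        rows[name] = (ms, ms / mse)  # F-ratio against the error mean square
    return mse, rows


mse, rows = anova_rows(EFFECTS, ERROR_DF, ERROR_SS)
total_df = sum(df for _, df, _ in EFFECTS) + ERROR_DF
total_ss = sum(ss for _, _, ss in EFFECTS) + ERROR_SS

print(f"Error MS = {mse:.2f}, total DF = {total_df}, total SS = {total_ss:.2f}")
for name, (ms, f) in rows.items():
    print(f"{name:13s} MS = {ms:10.2f}   F = {f:7.3f}")
```

The degrees of freedom and sums of squares add up to the printed totals, which is a useful check that the rows of the table have been read correctly.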

In view of the purpose of the experiment and the original hypothesis
presented, an attempt is made to perform the regression analysis set
forth under the null hypothesis. Table IV indicates the linear regression
functions in reference to time. One may observe that the residual error
after fitting closely resembles, on the average, the error mean square
from the analysis of variance. Table V indicates the regression functions
in reference to load. In like manner, the residual error after fitting
resembles the error mean square from the analysis of variance.
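The comparison described above, between the residual error after fitting a line and the error mean square of the analysis of variance, can be illustrated with an ordinary least-squares fit. This is a generic sketch rather than the authors' computation: the function name is hypothetical, and the data in the usage note are invented purely to show the call.

```python
def fit_line(x, y):
    """Least-squares fit of y = alpha + beta * x.

    Returns (alpha, beta, residual_mean_square), where the residual mean
    square is the residual sum of squares divided by (n - 2).
    """
    n = len(x)
    if n < 3 or n != len(y):
        raise ValueError("need at least three paired observations")
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = ybar - beta * xbar
    rss = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    return alpha, beta, rss / (n - 2)
```

With invented heart rates such as `fit_line([0, 5, 10, 0, 5, 10], [94, 97, 99, 96, 98, 101])`, the third returned value is the quantity one would set beside the error mean square of Table III (117.30) to judge whether the fitted line leaves more unexplained variation than the layout itself.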

Tests  of  significance  have  not  been  performed  on  the  individual  constants 
listed  in  Tables  IV  and  V  in  that  the  appearance  of  interaction  effects  does 
not  allow  the  combining  of  all  the  data  or  the  three  replicates  as  was  done 
for  these  calculations.  We  have  not  formulated  the  precise  nature  of  the 
multiple  test  procedure  implied  here.  The  basic  intent  again  was  to  develop 
an idea of the form to assist in future designs.

4.  DISCUSSION. 

4.1 Subtle Observations.

a. Heart rate prior to leaving the sitting rest position.

It is interesting to observe (Figure 1) the resting heart rate
patterns. Fluctuations for a given subject on a given experimental run
were essentially similar for the different loads and repetitions. Thus
there was an identifiable pattern for each of the participating subjects.
One would judge that some of the fluctuation in general might be lessened
if subjects were isolated and tested singly in an environment in which
external stimuli were essentially nil. Statistically, of course, we have
treated the values in this phase as constants.

TABLE IV

Table of Linear Regression Functions:
Heart Rate vs. Time

Time
(Seconds)    α̂             β̂            ERMS      σ_α      σ_β
 -300.     79.244444    -.05333333     7.0593    1.6639    .25777
 -240.     78.655555    -.06000000     6.3054    1.4862    .23024
 -180.     78.777777     .13333335     5.6110    1.3225    .20488
 -120.     78.222222     .28000001     7.0641    1.6650    .25795
  -60.     86.499999    -.41999997     9.7390    2.2955    .35562
  -55.     90.899999    -.55333331    13.227     3.1178    .48300
  -50.     91.977777    -.35999996    11.914     2.8081    .43503
  -45.     92.655555     .00666670    12.102     2.8524    .44190
  -40.     94.411110     .23333336    13.937     3.2849    .50890
  -35.     92.622222     .26666670    12.917     3.0446    .47166
  -30.     88.433333     .52666668    13.759     3.2431    .50242
  -25.     86.388888     .15333334    11.775     2.7755    .42997
  -20.     87.199999    -.07999997    11.938     2.8139    .43592
  -15.     87.333333     .10666670    10.728     2.5286    .39172
  -10.     88.755555    -.02666669     9.5600    2.2533    .34908
   -5.     89.677777     .40666669     8.5388    2.0126    .31179
    0.     95.011110     .12000002     7.8273    1.8449    .28581
    5.     99.966666     .32666671     8.1022    1.9097    .29585
   10.     99.899999     .72666669     8.7247    2.0564    .31858
   15.    101.44444      .86666670     9.4751    2.2333    .34598
   20.    103.70000      .60666668     9.6506    2.2747    .35239
   25.    105.47778      .91333333    11.449     2.6985    .41805
   30.    106.04444     1.0133333     11.570     2.7271    .42248
   35.    106.05555     1.4733333     11.876     2.7993    .43366
   40.    108.48889     1.5866667     12.465     2.9380    .45515
   45.    109.68889     1.5600000     13.738     3.2382    .50166
   50.    110.62222     1.8000000     12.613     2.9729    .46056
   55.    112.45555     1.9266667     13.713     3.2321    .50072
   60.    107.64444     2.0133334     13.540     3.1914    .49441
   65.    101.70000     1.3133334     10.333     2.4355    .37731
   70.     99.799999     .66666669    10.289     2.4251    .37569
   75.     97.411110     .55333335     8.1311    1.9165    .29691
   80.     94.766666     .47333335     8.8753    2.0919    .32408
   85.     93.644444     .42666669     8.9016    2.0981    .32504
   90.     93.766666     .43333337     9.8854    2.3300    .36096
  120.     90.755555    -.10666665     8.9511    2.1098    .32685
  180.     83.622222     .25333335     8.1302    1.9163    .29687
  240.     81.633333     .11333336     8.2653    1.9482    .30181
  300.     80.422222     .49333335     8.5371    2.0122    .31173
  360.     81.888888     .25333336     8.5314    2.0109    .31152
  420.     81.866666    -.05333333     6.9260    1.6325    .25290
  480.     81.011110     .07333336     5.7840    1.3633    .21120
  540.     81.033333     .07333334     6.5464    1.5430    .23904
  600.     82.266666    -.10666665     5.9303    1.3978    .21655

TABLE V

Table of Regression Functions
Heart Rate vs. Load

Time
Segment   21.6 lbs.                            26.6 lbs.                            31.6 lbs.
I         79.7                                 79.7                                 79.7
II        82.6 + .55Δt*                        82.6 + .55Δt*                        82.6 + .55Δt*
IIIa      92.5 - .61Δt*                        92.5 - .61Δt*                        98.5 - .61Δt*
IIIb      83.6 + .15Δt*                        83.6 + .15Δt*                        83.6 + .15Δt*
IV        96.2 + .32t**                        96.7 + .43t**                        99.2 + .62t**
V         e^{4.68} t^{.0011} e^{-.087t}**      e^{4.74} t^{.0014} e^{-.1t}**        e^{4.82} t^{.0019} e^{-.13t}**

*  Start Δt at zero at the beginning of the respective segments and
increase by 5 for each interval.

** Start t at zero at the beginning of the respective segments and
increase by 1 for each 5 seconds.

b. Heart rate during immediate pre-work phase.

After the subjects left their resting chairs, they took several
paces and took a single step up to the work platform and assumed a
predetermined work position. The subject then remained in this position
to await the signal to receive the bar-bell and commence the exercise.

It  was  this  phase  that  caused  us  some  unexpected  concern  from  an 
analytical point of view in that we lost control of the individual in transferring
him from the resting phase to the working phase. More careful
planning  should  avoid  this  problem  in  the  future.  The  same  kinds  of 
variations  mentioned  in  (a)  above  likewise  were  found  in  this  phase.  These 
were  also  treated  in  linear  fashion. 

c.  Heart  rate  during  work. 

While it was expected that the heart rate would rise rapidly
with the sudden onset and continuation of intensive physical exercise, a
more precise statement on how it would rise was desired. This, hopefully,
would give some insight in reference to the possibility of an underlying
functional relationship between workload and heart rate response.

The data points for each of the three loads for the five subjects are
shown graphically in Figures 2, 3, and 4. These data were fitted with
linear regression lines as shown also on the graphs. It is interesting
to look now at the individual pattern for this phase of the experiment.
Figures 5, 6, and 7 show their characteristics. To us these were very
interesting observations, for they provided additional insight on the
manner in which individuals respond to a physical stress in a physiological
way, using a set of quantitative measures as opposed to the more common
but less rigorous clinical impressions. However, we are mindful of
the exploratory nature of this project as well as its being a small
non-random sample.

d.  Heart  rate  during  recovery. 

It is very interesting that heart rate falls so rapidly following
the cessation of physical work. This well known exponential fall, the
greater part of which takes place within approximately the first 10 to 15
seconds, was demonstrated in association with the raw data points for
the various loads shown in Figures 8, 9, and 10. According to the
results in this study, heart rate began to fall several seconds prior to
the cessation of the work. Our explanation for this is that it may in part
be anticipation by the individual toward the end of the work cycle and, in
part, due to the method of discretizing the data. Since the participants in
this investigation were considered to be clinically healthy males in a somewhat
restricted age range, no inferences are made regarding variations
in the return of individual heart rates to the normal or resting baselines.

The  fitted  exponential  regression  curves  shown  in  the  figures  mentioned 
above  are  treated  statistically.  Variations  are  attributed  to  circulatory 
system  characteristics  and  their  nervous  system  interactions.  Presumably 
external  stimuli  which  may  influence  heart  rates  in  the  resting  and  the 
final  stages  of  recovery  would  be  less  significant  during  intensive 
physiological  stress  derived  from  physical  work. 

e.  Heart  rate  over  the  experimental  cycle. 

A summary heart rate profile over all phases of the experimental
cycle, averaged for each load, is shown in Figure 1.

It is interesting to observe the slopes of the curves showing heart rate
increase in that they are clearly different even for the small increments
of work intensity. The same may be said in reference to the peak values.

4.2 Direction of Subsequent Investigation.

In brief, the following are being considered for subsequent investigations:


a.  Longer  work  periods  in  order  to  understand  more  about  heart 
rate  behavior  at  maximum  range  under  prolonged  work  stress. 

b.  The  utilization  of  an  open  system  in  order  to  accomplish  a.  above. 
Weight  lifting,  unlike  bicycling  or  tasks  utilizing  more  of  the  muscles  tends 
to  generate  exhaustion  prior  to  the  onset  of  peak  heart  rate. 

c. Certain biochemical parameters associated with circulatory
system response to work stress may be useful, particularly as they may, in
turn, be related to such medical conditions as shock. Here then we become
concerned with multivariate models and analysis.

d.  Planning  of  experiments  for  additional  insight  on  roles  of  other 
physical and psychological factors.

e. An ultimate objective is to relate "stress" to cardiovascular
system changes associated with early signs of cardiovascular deterioration
and a particular condition known as hemorrhagic shock.

STRATEGY FOR THE OPTIMAL USE OF WEAPONS
BY AREA COVERAGE*


J. A. Nickel, J. D. Palmer, and F. J. Kern
University of Oklahoma, Norman, Oklahoma
(Representing the U.S. Army Edgewood Arsenal)


ABSTRACT.  The  development  of  non-nuclear  ground-based  weapons 
systems  in  a  historical  perspective  is  briefly  reviewed.  The  implications 
of  this  development  to  target  acquisition  and  logistics  in  terms  of  efficiency 
of  coverage  are  included. 

By  defining  a  new  concept  termed  Efficiency  of  Target  Destruction 
as the ratio of expected area destruction of a target complex to the maximum
theoretical area destruction possible, the authors have demonstrated
that  a  delivery  of  a  number  of  small  effects  patterns  can  be  most  efficient. 
Through  use  of  the  SADI  Mark  IV,  Statistical  Additive  Density  Integrator, 
it  was  found  that  the  delivery  error  (standard  deviation  of  delivery)  must 
be  in  the  neighborhood  of  50%  of  the  target  radius  for  maximum  efficiency. 
It  was  further  found  that  the  efficiency  is  not  appreciably  reduced  if  the 
actual  aim  point  is  within  30%  of  the  target's  radius  of  the  center. 

These  results  clearly  indicate  that  for  certain  classes  of  targets  a 
decided  advantage  is  attained  in  terms  of  efficiency  of  the  weapons  system, 
reduction  of  target  locator  accuracy  requirements,  and  a  lessening  of 
the  impact  of  logistics  support. 

INTRODUCTION. In the generations of ground based weapons systems
since World War II, three readily identifiable stages have existed in the
development of non-nuclear weapons. In the first instance, the attempt
was to develop a warhead with the greatest possible damage or effects
pattern, which required larger and larger lethal radii for each particular
system. During this initial phase, it was tacitly assumed that if one
could develop a larger effects pattern, this was most easily delivered
on target. A major effort during this period was toward warhead design
and development, with little effort toward determining the accuracy
requirements. It was further assumed that once the warhead was available,
delivery on target would be readily achieved.


*This article appeared earlier as a University of Oklahoma Research
Institute Technical Report: Contract DA 18-035-AMC-116(A); Internal
Memorandum 1454-1-2, July 1955.

The second phase began with the realization that the warhead could
not be delivered on target with a high degree of reliability, that is, with a
high assurance level. This led to the development of more accurate locators
and target acquisition devices using sophisticated target acquisition
techniques such as infrared, radar, radiometers, acoustics, etc. In this
phase the dependence of any system on the inherent ability of a locator
not only to locate the target but also to locate itself relative to the
weapon was realized. This presented the third problem, with even more
difficulties for designers and tacticians.

The situation now becomes that of large lethal area weapons with a relatively
low accuracy, with the result that the net amount of lethal
pattern placed on a target is less than that which could be achieved
should a highly accurate method of firing be developed. With the realization
that these two viewpoints were mutually opposing, attempts have been
made to develop closely coordinated systems involving locators and
weapons. Considerable research has been performed in an attempt to
formulate a methodology which would serve to alleviate this inherent
difficulty.

In  establishing  minimum  criteria  for  target  location  and  firing  patterns, 
the  objectives  have  been  aimed  at  generating  more  accurate  locating 
systems and larger effects patterns. Target requirements have become
more  and  more  stringent.  More  potent  effects  patterns  (non-nuclear) 
have  been  developed  with  the  rather  obvious  end  result  of  requiring  greater 
locator  accuracy  to  achieve  a  maximum  effective  firepower  per  unit. 

Most  recent  studies  have  been  directed  to  ascertaining  error  sources  and 
attempting  to  provide  a  maximum  assurance  level  of  target  coverage  for 
a  given  system.  This  has  usually  resulted  in  going  to  larger  and  larger 
total  effects  patterns  as  a  consequence  of  the  inability  to  supply  more 
accurate target location methods. Tests to determine the maximum allowable
error for multiple effects patterns have resulted in a continuation
of this same trend. Hence, a higher required assurance value of target
destruction has resulted in specifications for more accuracy in location.

The problems which accrue from this trend are many. They include
the requirement for more accuracy and mobility in target location systems,
logistic difficulties associated with increased firing rates, loss of target
during "zero-in" fire due to target mobility, and high initial and maintenance
costs associated with larger, more complex weapons systems. The
results have been the generation of requirements for more accurate radars,
infrared devices, and optical detectors with accompanying data processing
equipment and weapons selectors, as well as more accurate delivery systems.

A re-examination of simulation data originally run to determine
minimum accuracy requirements to yield maximum area coverage has
resulted in a number of factors which point toward an entirely different
assessment of applicable criteria, contradicting previous concepts. In
attempting to determine the "efficiency" of various weapons systems
against standard target sizes, it was found that maximum efficiency seemed
to  occur  when  use  was  made  of  smaller  values  of  R^/R^,  (lethal  patterns) 

and  that  a  deliberate  error  of  up  to  15%  R^,  had  only  a  minor  effect  on 

area  coverage  at  the  maximum  efficiency  levels  and  further  that  a  sacrifice 
in  assurance  level  could  be  made  and  yet  have  a  better  system  than  is 
presently  available  against  certain  classes  of  targets  under  the  previous 
optimization  requirements. 

Through the use of the OURI-SADI Mark IV, a systematic study of
effects patterns and their effectiveness on area targets has been under
investigation. An analysis of the data has brought forward several observations.
Foremost among the observations is that the effectiveness in use
of munitions can be increased by reducing the size of the effects pattern
of a given round and distributing a number of these with a delivery error
that is bounded away from zero, i.e., not perfectly accurate, as well as
having an upper bound on the weapons errors.

For flame technology, this is particularly important since by not
attempting to cover the entire target with fuel, the insulation effect of
unburned fuels is minimized. The desirability of small portable flame
devices increases since on the criteria enumerated they are tactically
sound. Furthermore, multiple bursts, with each component yielding a
small effects pattern, require less delivery accuracy than a single
larger burst having the same potential of destruction.

The SADI Mark IV, Statistical Additive Density Integrator, as
developed by the personnel of the Systems Research Center, University
of Oklahoma Research Institute, permits the evaluation of lethality to a
target by simulation techniques. Through such studies, several factors
influencing the effectiveness of multiple firings on a target have come to
light. Area coverage effected by flame devices is particularly well
modeled by this simulation technique.

At this juncture of study, the basic configuration has been the random
placement of six circular "cookie-cutter" effects regions. Each
component, with total destruction or lethality throughout the circle, is
distributed about a point on a circular target. Circular effects regions
have been employed since, in a first approximation, this is approximately
the shape experienced under actual firings. Circular targets have been
used since maximum efficiency could be designed into the SADI Mark IV
with this configuration. It is known, however, that a topological equivalence
exists between this configuration and any other for which the
boundaries of the target and lethality region are simple closed curves.
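The configuration just described, six circular "cookie-cutter" regions scattered about an aim point on a circular target, lends itself to a small Monte Carlo sketch. The code below is not the SADI Mark IV procedure; it is a digital illustration under stated assumptions: delivery errors are taken as independent normal deviates with standard deviation `sigma` in each coordinate, the function and parameter names are hypothetical, and the "maximum theoretical area destruction" is taken to be the smaller of the total effects area and the target area.

```python
import math
import random


def coverage_fraction(n_rounds, r_effect, r_target, sigma,
                      offset=0.0, trials=200, samples=400, seed=0):
    """Estimate the expected fraction of a circular target covered.

    n_rounds circular effects regions of radius r_effect are placed with
    normally distributed delivery error (std. dev. sigma per coordinate)
    about an aim point displaced by `offset` from the target center.
    """
    rng = random.Random(seed)
    # Fixed sample points, uniform over the target disc (rejection sampling).
    points = []
    while len(points) < samples:
        x = rng.uniform(-r_target, r_target)
        y = rng.uniform(-r_target, r_target)
        if x * x + y * y <= r_target * r_target:
            points.append((x, y))
    r2 = r_effect * r_effect
    covered = 0
    for _ in range(trials):
        bursts = [(rng.gauss(offset, sigma), rng.gauss(0.0, sigma))
                  for _ in range(n_rounds)]
        for px, py in points:
            # A point is destroyed if it lies inside any effects circle.
            if any((px - bx) ** 2 + (py - by) ** 2 <= r2 for bx, by in bursts):
                covered += 1
    return covered / (trials * samples)


def efficiency(n_rounds, r_effect, r_target, sigma, **kwargs):
    """Expected destroyed area divided by the assumed maximum possible area."""
    target_area = math.pi * r_target ** 2
    expected_area = coverage_fraction(
        n_rounds, r_effect, r_target, sigma, **kwargs) * target_area
    max_area = min(n_rounds * math.pi * r_effect ** 2, target_area)
    return expected_area / max_area
```

A call such as `efficiency(6, 0.3, 1.0, 0.5)` estimates the efficiency for six rounds whose effects radius is 30% of the target radius and whose delivery error is 50% of the target radius; sweeping `sigma` or `offset` gives a rough picture of how sensitive the coverage is to delivery error and aim-point displacement.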

It should be further observed that the numerical discrepancy between
using circular patterns and rectangular patterns is negligible. [Ewing,
George. Predicting the Effects of Multiple Firing on an Area Target
and Related Questions. OCDD, USA AMS, Ft. Sill, 1955.] Implicit in
the foregoing equivalence are questions of approximated symmetry and
other regularity conditions which will not be considered.

APPROXIMATING  CONVOLUTION  OF  THE  MOMENT  GENERATING 
INTEGRAL.  In  trying  to  estimate  a  suitable  approximation  to  the 
probability  density  function  f(x)  of  a  population  from  which  samples  are 
drawn,  the  following  scheme  approximating  the  density  function  from  the 
empirical  moments  is  proposed. 

It is known in statistical theory that if the Moment Generating Function,
M(x), is known for a sampling distribution, then the moments of
that distribution are readily obtained from the derivatives.

For simplicity it is assumed that the probability density f(x) of a
continuous variable has a convergent Maclaurin expansion on the unit
interval (0, 1) and is zero elsewhere, i.e.,

$$f(x) = \sum_{k=0}^{\infty} b_k x^k, \quad 0 < x < 1,$$

$$f(x) = 0 \quad \text{elsewhere}.$$


This assumption permits the Moment Generating Function M(x) to be
expressed as an integral over the unit interval, i.e.,

$$M(x) = \int_0^1 e^{xt} f(t)\, dt.$$

This function can furthermore be expressed as a power series

$$M(x) = 1 + \sum_{k=1}^{\infty} \nu_k \frac{x^k}{k!},$$

where $\nu_k$ is the k-th moment about the origin.

A second possible interpretation is available by considering M(x)
as an Integral Transform instead of an expected value. As an Integral
Transform, the following needed properties can be established.

(1) M(ax + by) = aM(x) + bM(y)

Using the assumed power series expansion for the probability density

When the transforms of x are substituted into this last expression a
second power series expansion is obtained for the Moment Generating
Function, this time in terms of the Maclaurin coefficients of f(x). Two
power series converging to the same function necessarily have identical
coefficients. From this, it follows that


$$\nu_{k-1} = \sum_{j=1}^{\infty} \frac{b_{j-1}}{k + j - 1}, \qquad k = 1, 2, \ldots,$$

where $\nu_0 = 1$. These constitute an infinite system of equations in the
variables $b_{j-1}$, j = 1, 2, ...

Letting B denote the column matrix of the Maclaurin coefficients
$b_{j-1}$, N the column matrix of the moments $\nu_{k-1}$ calculated from the
sample, and A the Hilbert matrix

$$A = \left( \frac{1}{k + j - 1} \right), \qquad k, j = 1, 2, \ldots,$$

the foregoing system of equations can be written

AB = N .


The matrix A is singular and has no inverse. However, if the system
is truncated so as to utilize only a specific number (n) of moments, the
resulting (n + 1) by (n + 1) square matrix $A_n$ does have an inverse $A_n^{-1}$.
The approximating polynomial coefficients can then readily be obtained as

$$B = A_n^{-1} N .$$

It should be observed that the matrix $A_n$, and hence its inverse, is
independent of the sampling distribution; hence, one $A_n^{-1}$ can be used
for all samples at that degree of approximation.
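
The truncated system can be checked numerically. The following is a minimal sketch, assuming exact rational arithmetic and hypothetical helper names (`hilbert`, `solve`, `density_coefficients`) that do not appear in the original paper. It builds the (n + 1) by (n + 1) Hilbert matrix, prepends $\nu_0 = 1$ to the sample moments, and solves AB = N for the polynomial coefficients.

```python
from fractions import Fraction


def hilbert(size):
    """Hilbert matrix with entries 1/(k + j - 1), k, j = 1..size (0-based here)."""
    return [[Fraction(1, i + j + 1) for j in range(size)] for i in range(size)]


def solve(a, rhs):
    """Solve a * b = rhs by Gauss-Jordan elimination in exact arithmetic."""
    n = len(a)
    m = [row[:] + [Fraction(v)] for row, v in zip(a, rhs)]
    for col in range(n):
        # Find a row with a nonzero pivot and move it into place.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        pv = m[col][col]
        m[col] = [v / pv for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [m[r][n] for r in range(n)]


def density_coefficients(moments):
    """moments = [nu_1, ..., nu_n]; nu_0 = 1 is prepended automatically."""
    n_col = [Fraction(1)] + [Fraction(v) for v in moments]
    return solve(hilbert(len(n_col)), n_col)


# Moments of the uniform density on (0, 1) are nu_k = 1/(k + 1),
# so the recovered polynomial should be f(x) = 1.
uniform = density_coefficients([Fraction(1, k + 1) for k in range(1, 5)])
```

Setting all four moments to zero returns the first column of the inverse matrix, which gives the constant terms of the coefficient formulas for a fourth-degree polynomial.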

From  a  casual  observation  of  the  data  it  is  apparent  that  the  density 
function  is  not  uniform,  normal,  or  even  symmetrical.  It  follows  that  any 
admissible  polynomial  approximation  should  be  by  a  polynomial  of  degree 
greater  than  two.  To  allow  for  the  possibility  of  symmetry  it  is  reasonable 
to  consider  an  approximating  polynomial  of  degree  four  (4).  If  a  least 
squares  analysis  were  to  be  employed  in  determining  the  coefficients  for 
such  a  polynomial,  it  would  be  necessary  to  use  eight  (8)  moments  of  the 
relative areas. Since the basis for accepting the polynomial of degree
four  as  a  good  approximation  to  the  density  function  is  not  established, 
an  abbreviated  procedure  over  a  least  squares  evaluation  is  desired. 

A polynomial approximation to the probability density function was
developed through the use of an approximating convolution of the Moment
Generating Integral. If the approximating density function is given by

$$f(x) = b_0 + b_1 x + b_2 x^2 + b_3 x^3 + b_4 x^4 ,$$

where x is the relative area reduced to the unit interval, the above
approximating convolution gives the following formulas for the coefficients:

$$b_0 = 25 - 300\nu_1 + 1050\nu_2 - 1400\nu_3 + 630\nu_4$$

$$b_1 = -300 + 4800\nu_1 - 18900\nu_2 + 26880\nu_3 - 12600\nu_4$$

$$b_2 = 1050 - 18900\nu_1 + 79380\nu_2 - 117600\nu_3 + 56700\nu_4$$

$$b_3 = -1400 + 26880\nu_1 - 117600\nu_2 + 179200\nu_3 - 88200\nu_4$$

$$b_4 = 630 - 12600\nu_1 + 56700\nu_2 - 88200\nu_3 + 44100\nu_4$$

where $\nu_1$, $\nu_2$, $\nu_3$, and $\nu_4$ are the moments of the relative areas of
coverage about the origin reduced (or scaled) so that the maximum relative
area is one (1). This calculation only requires the use of four moments
of the distribution to give an approximating polynomial. The only segment
of the ensuing polynomial used is that part lying above the axis and
corresponding to the range of values of the original sampling distribution.

For the purposes of the original problem, the cumulative probability
distribution is needed. This is readily approximated by the polynomial

$$P = b_0 x + \tfrac{1}{2} b_1 x^2 + \tfrac{1}{3} b_2 x^3 + \tfrac{1}{4} b_3 x^4 + \tfrac{1}{5} b_4 x^5 ,$$

obtained from integrating the approximating density polynomial. This
again is used only over the domain corresponding to the observed area
input obtained from the simulator. As a statistical control, a Kolmogorov-
Smirnov Test was applied to the empirical distribution and the calculated
approximating cumulative probability polynomial.
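
The comparison described above can be sketched as follows. This is an illustrative example only: the function names and the three-point sample are hypothetical, and the statistic computed is the usual one-sample Kolmogorov-Smirnov distance between the empirical distribution and the cumulative polynomial.

```python
def cdf_polynomial(b, x):
    """Integral of the density polynomial: sum of b[k] * x**(k+1) / (k+1)."""
    return sum(bk * x ** (k + 1) / (k + 1) for k, bk in enumerate(b))


def ks_statistic(sample, cdf):
    """Largest gap between the empirical step function and cdf."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        fx = cdf(x)
        # Check the gap just after and just before the step at x.
        d = max(d, i / n - fx, fx - (i - 1) / n)
    return d


# Hypothetical sample of scaled relative areas, tested against the
# uniform density b = [1, 0, 0, 0, 0], for which P(x) = x.
d_uniform = ks_statistic([0.7, 0.1, 0.4],
                         lambda x: cdf_polynomial([1, 0, 0, 0, 0], x))
```

The resulting distance would then be compared with a tabulated critical value for the sample size in use.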

SYSTEMATICALLY INTRODUCED BIAS. It has long been recognized
that the exact position of a target relative to the weapon is
generally not known initially. This raises the question of bias effects in
the  assumed  target  location  relative  to  the  actual  target  center.  A  study 
has  been  initiated  to  investigate  the  systematic  introduction  of  bias  in  the 
location  of  ground  zero.  The  actual  procedure  used  is  probably  best 
described  through  the  use  of  the  flow  chart  of  Figure  1. 

Initially, in the study of bias effects, the parameters considered
have been $\sigma = 0.5$ and $r = 0.2\sqrt{5}$. For this particular case, it became
apparent in preliminary investigations that a bias less than or equal to
0.3 of the target radius produced only a minor decrease in the expected area
coverage. The fall off to a first approximation is parabolic and the area
coverage can be approximated by multiplying the expected area coverage
of a symmetrical distribution by the factor

1  -  V  0.  925X  0  <  X  <0.4  . 


In this factor, the bias X is the ratio of the distance between the target
center and aim point to the target radius. For X = 0.3, the fall off
is approximately 15%, and hence for smaller bias, the correction is quite
insignificant. It must again be pointed out that the foregoing correction
is based upon the observations of one pair of parameters. A larger set
of parameters must be considered before general conclusions can be
drawn with a high degree of certainty.

FIGURE 1. Bias Effect Analysis on a Sampling Distribution of a
Composite of N Effects Patterns (flow chart)

Dictionary of Symbols

N = number of effects patterns (components) in composite

σ = standard deviation of delivery

r = relative radius of effects pattern (component)

n = total number of composite patterns to be considered

s = total number of shifts to be considered (bias effects in terms
of 1/s target radii)

k = total number of rays to be considered

FIGURE 8. Decrement in destruction as a function of offset. Ordinate:
percent destruction relative to destruction with no offset. Abscissa:
offset units (1 offset unit = 0.2 $R_T$).

FIGURE 10. Area coverage as a function of displacement of target aim
point. Abscissa: offset units.

One  conclusion  inferred  from  the  foregoing  observation  is  that  for 
multiple  firings  on  an  area  target,  the  accompanying  target  acquisition 
problem  is  not  of  major  significance,  since  minor  inaccuracies  in  the 
target location will not significantly affect the expected amount of
destruction when all rounds are aimed at what is considered to be the target
center. 

SYMMETRICAL PROBLEM. The first study to be considered
consisted of six rounds being aimed at the center of a circular target
and distributed with a circular normal probability distribution about the
aim point. Standard deviations equal to one-half and three-fourths of the
target radius were used with a larger variety of effects circles. (Nickel,
J. A., Palmer, J. D., Battlefield Simulation for First Round Accuracy
Requirements of Simultaneous Multiple Firings. Proceedings of Winter
Convention on Military Electronics, IRE, 1963; Nickel, J. A., Palmer,
J. D., Gajjar, J. T., Kern, F. J., and Williams, D. R. Battlefield
Simulation for First Round Accuracy Requirements of Simultaneous
Multiple Firings. Data Supplement No. 1. DA 34-031-AIV-679, 1107-5-6,
January 8, 1963.)

In  all  cases  considered,  it  was  observed  that  the  smaller  standard 
deviation  consistently  yielded  a  greater  statistical  area  coverage.  In 
other  words,  for  a  given  size  of  the  component  effects  circle,  the 
standard  deviation  of  one-half  the  target  radius  gave  a  greater  area 
coverage than did the larger standard deviation of three-fourths the
target radius. A local minimum area coverage is to be had with a
standard  deviation  of  zero,  in  which  case  all  effects  components  would 
lie  on  top  of  each  other,  giving  a  total  effective  area  equivalent  to  that 
produced  by  a  single  component. 

Consider the statistical area coverage as a function of the standard
deviation of delivery, $\sigma$, as well as the radius of the effects circle
component, r. Notationally, this will be written as $A(\sigma, r)$. From the
remarks of the preceding paragraph one observes that

$$A(0, r) < A(0.5, r), \qquad r < 1,$$

$$A(0.75, r) < A(0.5, r) .$$

Since $\sigma$ can be varied continuously, it is reasonable to surmise that
the area function also varies continuously. One now concludes from the
mean value theorem of differential calculus that the area function achieves
an expected maximum value for some $\sigma$ in the neighborhood of $\sigma = 0.5$
for each value of r. The associated values are yet to be approximated
by simulation studies. A fundamental conclusion to be drawn from these
observations is that for a given size of effects components, there is a
critical value for the standard deviation of delivery which will yield
a maximum area coverage at a given statistical level, when the aim
point is the target center. Figure 2 illustrates this fact by exhibiting
a random delivery pattern, delivered with standard deviation of
(a) $\sigma = 0.5$ and (b) $\sigma = 1.0$. In Figure (2a) there is considerable
overlapping of lethality components. In Figure (2b) two of the lethality
components are so far removed from the target center that they inflict no
damage on the target and are not recorded in the figure.
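
The behavior of $A(\sigma, r)$ can be explored with a small Monte Carlo sketch. This is not the SADI Mark IV procedure; it is a hypothetical stand-in that assumes a unit-radius target, a circular normal delivery error about the target center, and a lattice of sample points for measuring the covered fraction. The function name and its parameters are illustrative.

```python
import random


def expected_coverage(sigma, r, rounds=6, trials=200, grid=20, seed=0):
    """Mean fraction of a unit-radius target covered by the union of
    `rounds` effects circles of radius r, aimed at the target center."""
    rng = random.Random(seed)
    # Lattice of sample points lying inside the unit target circle.
    pts = [(i / grid, j / grid)
           for i in range(-grid, grid + 1)
           for j in range(-grid, grid + 1)
           if i * i + j * j <= grid * grid]
    r2 = r * r
    total = 0.0
    for _ in range(trials):
        # Burst points: circular normal about the aim point (0, 0).
        centers = [(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
                   for _ in range(rounds)]
        covered = sum(
            1 for (px, py) in pts
            if any((px - cx) ** 2 + (py - cy) ** 2 <= r2
                   for cx, cy in centers))
        total += covered / len(pts)
    return total / trials


# With sigma = 0 all circles coincide, so the covered fraction is about r**2.
a_zero = expected_coverage(0.0, 0.45, trials=5)
a_half = expected_coverage(0.5, 0.45, trials=100)
```

Sweeping `sigma` over a range of values for fixed `r` would give a rough picture of where the maximum coverage lies for this simplified model.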

A  second  observation  based  upon  this  modeling  is  that  for  a  given 
number  of  effects  components,  the  total  effects  or  area  coverage  as 
measured  on  the  simulator,  increases  with  an  increase  in  r,  the  radius 
of  the  component  circles.  However,  an  increase  in  r  is  accompanied 
with  an  increase  in  the  areas  shared  by  two  or  more  components.  This 
effect is illustrated in Figures 3a and 3b. These figures are composites
showing the effect of distributing six rounds (lethal components) about
the center of the target with $\sigma = 0.5$. The shaded set of circles
corresponds to r = 0.25 whereas the larger boundary about these shaded
circles corresponds to r = 0.5.

From a tactical point of view, a weapon is most effective if it deploys
to a given target only the minimum quantity of casualty producing material.
Using this as a basis, it is proposed that an index of efficiency, E, can
be determined by

$$E = \frac{\text{Expected Target Area Coverage}}{\text{Theoretical Area Coverage}} .$$

The Theoretical Area Coverage in the formula is defined as the total area
that could be covered by the casualty producing material if it were
distributed uniformly. This value, the Theoretical Area Coverage, may
exceed the total area of the target. If $A_c$ is the effects area produced
by one component, then $N A_c$ is the Theoretical Area Coverage, N
being the number of effects areas used.
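
As a worked illustration of the index, the sketch below assumes circular effects components of radius r, so that the area of one component is $\pi r^2$. The function name is hypothetical. For a delivery with zero standard deviation all N circles coincide, the expected coverage is that of a single component, and the index reduces to 1/N.

```python
import math


def efficiency_index(expected_area, n_components, r):
    """E = expected target area coverage / (N * area of one effects circle)."""
    theoretical = n_components * math.pi * r * r
    return expected_area / theoretical


# Six coincident circles of radius 0.45 cover only one circle's area.
e_coincident = efficiency_index(math.pi * 0.45 ** 2, 6, 0.45)
```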

Defining efficiency as above, it is readily observed that the efficiency
decreases with an increase in the size of the effects components for a
given number of components. The dependence on the standard deviation
of the distribution again enters in with an apparent maximum again
somewhere near $\sigma = 0.5$. This criterion of efficiency no longer demands
extreme accuracy. The classical phrase "don't shoot until you see the
whites of their eyes," is in general not applicable, excepting certain
limited tactical situations.

At all levels of significance and for all $\sigma$, the efficiency curves
appear to approach each other asymptotically for large component radii,
r. Further, it should be observed that the limit value for the efficiency
index on increasing r is 0. Figures 4 and 5 indicate the efficiencies
for $\sigma = 0.5$ and $\sigma = 0.75$ respectively. From these curves, the
following qualitative information is evident. First, the efficiency increases
with a decrease in the level of assurance demanded. For small r and
large $\sigma$ there seems to be an achievable maximum of efficiency
obtainable. This may, however, be an apparent condition peculiar to the
particular sample set used. Further investigation is needed on this point.
A flow chart describing the proposed investigation on the SADI Mark IV
is found in Figure 6.

A conjecture resulting from these observations is that the efficiency
can be increased by reducing the size of the effects circles to a critical
size dependent upon the target size and the standard deviation of delivery
$\sigma$. This implies an entirely new concept for matched weapons systems.

LOGISTICAL IMPLICATIONS. For tactical neutralization or
destruction of area targets, a number of tentative conclusions can be
formulated in light of the foregoing observations. If it is desirable to
strike the target without forewarning, several features need to be
considered. A single round could be used, but in such situations the
effects of bias and delivery errors play a major role (Nickel, J. A.,
Palmer, J. D. Nomograph for the Determination of the Common Area
Intersection of Two Distributed Circles, DA 34-031-AIV-679, 1107-5-9;
March, 1964.) in the relative effectiveness of that round, that is, target
location problems are of paramount importance.

As  an  alternate  approach,  multiple  rounds,  each  with  a  smaller  effect 
pattern  can  be  employed.  In  such  a  deployment  of  munitions,  the  effects 
of bias (if not too great) are minimized. Furthermore, the efficiency of
effective  area  coverage  can  be  increased  for  a  suitable  matching  of  effects 
radius  and  standard  deviation  of  delivery  to  the  target  radius. 

An immediate implication is that an effective weapons system to be
employed against area targets for which protective procedures can be
effected, such as mobile targets, personnel, etc., is one which can
deliver a number of rounds, each with a small effects radius. The
effectiveness of the system is optimized and does not require excessive
accuracy. Such weapons presumably would include small caliber cannons,
rocket launchers, and mortars, with a variety of warheads from HE to
flame and other incendiary devices. Stated another way, minor
inaccuracies in the location of a target under the fire of a volley will not
significantly affect the expected amount of destruction.

Unloading a volley on a target before protective measures can be
undertaken may be tactically more efficient than attempting to zero in
on a target by successive firings and corrections. Not only does the
zeroing give forewarning, but increased accuracy in the knowledge of
target location is not found. As an illustration consider the problem of
using two rounds to bracket the target and the third for effect. First,
in order to assure equivalent ballistic trajectories, missiles of the same
size and mass must be used, and hence two rounds are wasted. To
further assess the consequences of bracket firing, suppose the first
round is fired short (deliberately) by an amount s. Due to errors
inherent to the system, the round lands at $P_2$ instead of $P_1$. For
symmetrical bracketing, Figure 7, the second round is aimed at $P_3$,
a point symmetrically located with respect to the target point T. If
$P_1$ had coordinates (0, -s), then $P_2$ has coordinates (-x, -y-s), where x
and y are distributed by the appropriate error ellipse of the weapon.
The intended coordinates of $P_3$ are (x, y + s), but in actuality the aim
point is at $P_4$ with coordinates (2x, 2y + s). The second round, actually
aimed at $P_4$, lands at a point $P_5$ with coordinates (2x + x', 2y + y' + s),
where x' and y' are again distributed according to the error ellipse of the
weapon. From the coordinates of $P_2$ and $P_5$, the observed burst points,
corrections are calculated for determining the target location T. The
correction is computed from the coordinates of either $P_2$ or $P_5$, but in
actuality it is applied to the unknown coordinates of the aim points
$P_1$ and $P_4$. If the correction is applied to $P_1$, the
aim point, $Z_1$, has coordinates (x, y), whereas if the correction is
applied to $P_4$, the aim point, $Z_2$, has coordinates (x', y'). This is
interpreted as meaning that the weapon can now "know" the location
of the target only to within the probability distribution of the weapon
under bracket fire techniques, and hence in firing for effect, the
location of the third round will be distributed about the target with a
probability distribution having twice the variance of the weapon.
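
The doubling of the variance can be illustrated with a short simulation. This is a minimal sketch under stated assumptions: the target T is at the origin, the delivery error is circular normal with standard deviation `sigma`, and the correction is taken as the observed miss of the first burst, applied to the first aim point. The function name is hypothetical.

```python
import random
import statistics


def third_round_x(sigma, s, trials, seed=0):
    """x-coordinates of the round fired for effect after one correction."""
    rng = random.Random(seed)
    xs = []
    for _ in range(trials):
        # Round 1 is aimed at P1 = (0, -s) and lands at P2 = (-x, -y - s).
        x, y = rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)
        p1 = (0.0, -s)
        p2 = (p1[0] - x, p1[1] - y)
        # Correction = T - P2, with T = (0, 0); applied to P1 it gives Z1 = (x, y).
        z1 = (p1[0] - p2[0], p1[1] - p2[1])
        # The round for effect is aimed at Z1 and carries its own delivery error.
        xs.append(z1[0] + rng.gauss(0.0, sigma))
    return xs


impacts = third_round_x(sigma=1.0, s=0.5, trials=20000)
variance_ratio = statistics.pvariance(impacts) / 1.0 ** 2
```

The ratio of the impact variance to the weapon variance comes out near 2 in this model, in line with the statement above.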

It  should  be  observed  that  successive  rounds  fired  at  the  target 
without  further  correction  will  be  distributed  about  an  aim  point  offset 
from  the  target  in  some  direction.  This  offset  is  an  example  of  the 
unknown  bias  in  the  delivery  of  munitions.  This  further  emphasizes 
the  desirability  of  multiple  small  round  firings  in  order  to  take  advantage 
of  insensitivity  to  bias  in  delivery  effectiveness. 

Manpower requirements and other aspects of logistical support
point to other desirable features of such systems. The importance of
such consideration has been noted many times and has been particularly
well stated by Marshak and Mickey (Rand Corporation) when commenting
on the optimal choice for weapons when they said,
"We  want  to  choose  a  weapon  system  that,  subject 
to  a  given  cost  constraint,  will  maximize  the 
mathematical  expectation  of  a  military  utility 
(probability  of  victory).  " 

The  foregoing  model  is  based  on  the  correlation  of  probability  of  victory 
to  the  target  area  coverage.  Some  further  comments  on  the  nature  of 
cost constraints have been briefly considered by Nickel and Palmer

APPENDIX A

Data for studying the bias effects (lack of knowledge concerning target
center) on a sampling distribution of a composite of N effects patterns have
been taken on the SADI Mark IV. A flow chart exhibiting the basic data
taking procedure is presented in Figure 1. The data have been subsequently
reduced to cumulative probability curves by means of the moment
generating technique discussed in this paper. All parameters are normalized
with respect to the radius of the target.

A  preliminary  analysis  using  six  rounds  distributed  with  a  normalized 
standard  deviation  of  0.  5,  and  a  destructive  component  radius  of  0.45  is 
shown  in  Figure  8.  This  curve  exhibits  the  relative  change  in  the  average 
area  of  destruction  for  each  displacement  from  the  center  and  for  which  it 
was  observed  that  small  displacements  had  little  effect  on  the  area  coverage. 

In order to get a more detailed view of the results, another set of
patterns was investigated. During this investigation the distribution error
was specified in terms of circular probable error (CPE = 1.177$\sigma$), but
the same circles of destruction (ratio of destruction radius to target
radius equal to 0.45) were employed as in the
preliminary investigation. Figure 9 exhibits the family of cumulative
probability curves as functions of area coverage resulting from the set
of more than 50 patterns. Each curve in the family specifies the
displacement of intended aim point in terms of 0.2 of the target radius, i.e., each
curve represents a shift of aim point by 20% of the target radius from the
target center.
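
The factor 1.177 relating circular probable error to $\sigma$ follows from the circular normal distribution, for which the probability of landing within radius R of the aim point is $1 - e^{-R^2/(2\sigma^2)}$. Setting this equal to one-half gives $R = \sigma\sqrt{2 \ln 2}$. A short numerical check, with illustrative variable names:

```python
import math

# Radius (in units of sigma) containing half of the impacts
# of a circular normal distribution.
cpe_factor = math.sqrt(2.0 * math.log(2.0))

# Probability of landing within that radius, which should be one-half.
prob_within_cpe = 1.0 - math.exp(-cpe_factor ** 2 / 2.0)
```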

In  considering  these  several  curves,  their  similarity  and  ordering 
is  as  would  be  expected.  It  must  be  pointed  out,  however,  that  for 
displacements  less  than  30%  of  the  target  radius,  the  fall  off  in  area 
coverage  is  small.  To  further  clarify  this  point,  Figure  10  illustrates 
the  area  coverage  as  a  function  of  the  aim  point  displacement  for 
confidence levels of 10, 25, 50, 75, and 90%.

Data for determining the optimum number of rounds to be deployed
against a target were obtained on the SADI Mark IV according to the
scheme exhibited in the flow chart of Figure 6. At the time of writing
this  report  all  of  the  desired  data  had  not  been  generated  and  the  analysis 
is  not  complete.  Several  sizes  of  destructive  or  lethal  components  and 
dispersion  errors  are  under  investigation  with  the  hope  of  determining 
optimum  parameters. 


THE VARIABILITY OF LETHAL AREA


Bruce Barnett
Data Processing Systems Office,
Picatinny Arsenal, Dover, New Jersey

The  purpose  of  this  paper  is  to  describe  a  statistical  model  that 
estimates  the  variability  of  lethal  area  when  fragment  mass  and  initial 
fragment  velocity  are  allowed  to  randomly  vary  between  specified 
limits.  Prior  to  this  development,  the  general  lethal  area  equation 
will  be  derived  to  illustrate  the  nature  of  the  equations  involved  and 
to  show  the  assumptions  made  in  its  derivation. 

The lethal area concept is usually applied to anti-personnel
munitions that are of a fragmenting nature such as bombs, mines, grenades
and shells. The lethal area is a number that yields a measure of
effectiveness of the particular munition under investigation: the larger
the lethal area the more effective the weapon. The usual mathematical
definition is the following: "The lethal area of a weapon is that number
which when multiplied by a constant density of targets will yield the
expected number of incapacitations". Figure 1 illustrates a typical
situation.

Shown here is a shell bursting over some area A containing N
targets, uniformly distributed. Let h be the height of the shell at
detonation, $\omega$ its angle of fall, and $\theta_1$, $\theta_2$ the zone angles within
which fragments are ejected. These fragments are to incapacitate as
many targets as possible. Let the position of each target temporarily
be known, the coordinates of the ith target being $(X_i, Y_i)$. The density
of targets is the ratio $\rho_T = N/A$. The lethal area, $A_L$, can be written

(1)  $A_L = \frac{A}{N} E[N_c] = \frac{1}{\rho_T} E[N_c]$

so that multiplying $A_L$ by $\rho_T$ yields $E[N_c]$ according to the definition.
Here

$N_c$ = random variable = the number of casualties

and $E[N_c]$ is the expected value of $N_c$.

The lethal area equation is not usable, however, in the general
form of equation (1). To refine this equation let

$Y_i$ = random variable = 1 if the ith target is incapacitated
      = 0 otherwise.

The number of casualties can, therefore, be expressed as

(2)  $N_c = \sum_{i=1}^{N} Y_i$

Defining $P_{K_i}$ as the probability that $Y_i = 1$, it follows that

(3)  $A_L = \frac{A}{N} \sum_{i=1}^{N} E[Y_i] = \frac{A}{N} \sum_{i=1}^{N} P_{K_i}$

Refining the lethal area equation further let

$N_i$ = random variable = number of fragments striking the ith
target

$P_{HK_i}$ = random variable = probability that any one fragment
striking the ith target incapacitates that target

$X_j^{(i)}$ = random variable = 1 if the jth fragment to strike the ith
target is the first fragment to incapacitate that target
= 0 otherwise.

Expressing $Y_i$ in terms of the $X_j^{(i)}$ yields

(4)  $Y_i = \sum_{j=1}^{N_i} X_j^{(i)}$
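Applying equation (5)

it follows that

$E[Y_i] = E\left[ E[Y_i \mid P_{HK_i}, N_i] \right]$

$\quad = E\left[ E\left[ \sum_{j=1}^{N_i} X_j^{(i)} \,\Big|\, P_{HK_i}, N_i \right] \right]$

$\quad = E\left[ \sum_{j=1}^{N_i} 1 \cdot \mathrm{Prob}\left( X_j^{(i)} = 1 \right) \right]$

(6)  $\quad = E\left[ \sum_{j=1}^{N_i} (1 - P_{HK_i})^{j-1} P_{HK_i} \right]$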

Summing the geometrical series in the latter equation produces

(7)  $E[Y_i] = E\left[ 1 - (1 - P_{HK_i})^{N_i} \right]$

Using the Poisson distribution as an approximation to the binomial,
equation (7) becomes

(8)  $E[Y_i] \approx E\left[ 1 - e^{-N_i P_{HK_i}} \right]$ .

This equation is further approximated as follows

(9)  $E[Y_i] \approx 1 - e^{-E[N_i]\, E[P_{HK_i}]}$

This is equivalent to expanding $1 - e^{-X}$ about the point $E[X]$, $X = N_i P_{HK_i}$,
and using the first term.

Letting

(10)  $N_i = \rho_i \cdot A_{P_i}$

where

$\rho_i$ = random variable = density of fragments at the ith target

$A_{P_i}$ = random variable = presented area of the ith target

then

(11)  $E[N_i] \approx E[\rho_i]\, E[A_{P_i}]$

so that finally

(12)  $A_L = \frac{A}{N} \sum_{i=1}^{N} E[Y_i] \approx \frac{A}{N} \sum_{i=1}^{N} \left[ 1 - e^{-E[\rho_i]\, E[A_{P_i}]\, E[P_{HK_i}]} \right]$
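
The step from the geometric series in equation (6) to the closed form in equation (7), and the quality of the exponential approximation in equation (8), can be checked numerically for fixed values of $N_i$ and $P_{HK_i}$. The sketch below uses hypothetical function names and illustrative values (ten fragments, each with a 0.05 chance of incapacitating).

```python
import math


def p_incap_series(p_hk, n):
    """Sum over j of (1 - p)^(j-1) * p: the j-th hit is the first to incapacitate."""
    return sum((1.0 - p_hk) ** (j - 1) * p_hk for j in range(1, n + 1))


def p_incap_closed(p_hk, n):
    """Closed form of the geometric series: 1 - (1 - p)^n."""
    return 1.0 - (1.0 - p_hk) ** n


def p_incap_exponential(p_hk, n):
    """Exponential (Poisson-type) approximation: 1 - exp(-n * p)."""
    return 1.0 - math.exp(-n * p_hk)


series = p_incap_series(0.05, 10)
closed = p_incap_closed(0.05, 10)
approx = p_incap_exponential(0.05, 10)
```

The approximation is close when $P_{HK_i}$ is small; it degrades as the single-fragment probability grows.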


This equation can be used when the targets are at predetermined
positions and should yield a good estimate of the lethal area. This is
so because the known target locations enable reasonable estimates for
$E[\rho_i]$, $E[A_{P_i}]$, and $E[P_{HK_i}]$ to be assigned. Data for $P_{HK_i}$, the
probability that a random fragment striking the ith target incapacitates
that target, can be obtained experimentally, depending in part on the
mass and striking velocity of the fragment.

In a tactical situation, however, the target coordinates are rarely
known and it is desirable to obtain an analogous lethal area equation to
handle this typical case. To accomplish this, let

$E[Y_i] = P_K$, where now $P_K$ is a function of $(X_i, Y_i)$, $X_i$, $Y_i$
being random variables defining the coordinates of the ith target.



In this case

$E[Y_i] = E\left[ E[Y_i \mid X_i, Y_i] \right]$

(12)  $\quad = E[P_K(X_i, Y_i)]$

$\quad = \iint_A P_K(X_i, Y_i)\, f(X_i, Y_i)\, dX_i\, dY_i$

Since a uniform density of targets is assumed

(13)  $f(X_i, Y_i) = \frac{1}{A}$

so that

(14)  $E[Y_i] = \frac{1}{A} \iint_A P_K(X_i, Y_i)\, dX_i\, dY_i$

Substituting this equation in the lethal area equation produces

(15)  $A_L = \frac{A}{N} \sum_{i=1}^{N} \frac{1}{A} \iint_A P_K(X_i, Y_i)\, dX_i\, dY_i$

Each of these N integrals is identical, so that

(16)  $A_L = \frac{A}{N} \cdot \frac{1}{A} \cdot N \iint_A P_K\, dX\, dY = \iint_A P_K\, dX\, dY$

This is the usual lethal area equation. It can be evaluated by judiciously
selecting various points in the groundplane, evaluating $P_K$ at these
points and numerically obtaining the value of the integral.
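
A minimal sketch of such a numerical evaluation is given below. It assumes a square region of the ground plane sampled at cell midpoints, and it uses a purely hypothetical kill-probability function, $P_K = e^{-(X^2 + Y^2)/a^2}$, chosen only because its integral over the plane, $\pi a^2$, is known in closed form. Neither the function names nor this form of $P_K$ come from the paper.

```python
import math


def lethal_area(p_k, half_width, n):
    """Midpoint-rule estimate of the integral of p_k(x, y) over the square
    [-half_width, half_width] x [-half_width, half_width], using n x n cells."""
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * h
        for j in range(n):
            y = -half_width + (j + 0.5) * h
            total += p_k(x, y)
    return total * h * h


def example_p_k(x, y, a=10.0):
    """Hypothetical kill probability falling off with ground range."""
    return math.exp(-(x * x + y * y) / (a * a))


estimate = lethal_area(example_p_k, half_width=60.0, n=400)
exact = math.pi * 10.0 ** 2
```

In practice the points would be chosen where $P_K$ changes most rapidly, and the region would be limited to the maximum effective range of the fragments.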
This equation, however, does not allow for any of the parameters
to be randomized; that is, it cannot be used directly to ascertain the
variability of lethal area. Before describing the statistical model, it
is worthwhile to state several reasons for analyzing the variability
of lethal area. Some are:


1.  A quantitative measure of the variability of lethal area due to
specific parameters is provided. A possible application of this is for
establishing tolerances. For example, there are controlled and
uncontrolled variables associated with a shell, fragment breakup and explosive
weight being somewhat controlled, burst height (for an air burst) and
angle of fall being uncontrolled. Tendencies exist to maintain tight
tolerances on variables that can be controlled even at more expense.
In view of the variability induced by parameters that cannot be controlled,
these tight tolerances may possibly be relaxed without
significantly affecting the overall effectiveness. Conversely, variability can
point out those parameters that need be better controlled to assure
more uniform effectiveness.


2.  Variability  affects  the  design  of  optimum  rounds.  Briefly, 
rounds  should  not  be  designed  to  produce  high  effectiveness  under  ideal 
burst  conditions,  but  decrease  sharply  in  effectiveness  when  variations 
from  these  ideal  conditions  are  present. 


3.  Variability  analysis  permits  probabilistic  bounds  to  be  placed 
on  the  number  of  casualties.  For  example,  it  may  be  advantageous  in 
some  situations  to  have  a  minimum  assurance  level  for  incapacitating 
at  least  P%  of  the  targets. 


To  study  the  variability  of  lethal  area  in  the  most  general  case 
would  first  necessitate  establishing  the  independent  random  variables 
and  those  quantities  in  the  lethal  area  equation  that  depend  on  them. 
For  example,  one  may  write 


(17)  $A_L = \iint_A \left( 1 - e^{-\rho(m(\theta, V_0),\, h,\, \theta_1,\, \theta_2,\, \omega)\; A_P(h, \ldots)\; P_{HK}(m, V_0, \theta, a)} \right) dA$

Here, it is assumed that the density $\rho$ of fragments depends on the
mass breakup, which in turn depends on the initial fragment velocity
$V_0$ and the angle $\theta$ measured off the nose of the shell. The burst height,
spray angles and angle of fall also affect the density of fragments at a
selected target. Similarly for $A_P$ and $P_{HK}$. As a first analysis,
however, several simplifying assumptions will be made. Some of the
assumptions are somewhat unrealistic; for example, the drag coefficient
a is assumed independent of the fragment mass. It is for this reason
that the results from this analysis should not be strictly interpreted.
However, what may be of importance is to see how well the statistical
model estimates the variability, for then in the favorable case, the
possibility exists of generalizing the model to include more realism.

The  assumptions  used  in  this  analysis  are  listed  below. 

1.  Only the fragment mass m and initial fragment velocity V_0 will
be considered as random variables. This means that the burst height,
angle of fall, weight of fragmenting material, etc., are precisely known
in advance.

2.  A 90° fall angle is assumed.

3.  The fragments are all of the same mass and initial velocity,
although the particular m and V_0 are random variables.

4.  m and V_0 are independent random variables, both uniformly
distributed. α, the drag coefficient, is independent of both m and V_0.

5.  Inverse square law for density is assumed.

6.  A_p, the presented area of a target, is a known function of h and
R (R is the ground range to the target under consideration).

7.  P_HK is specified by an exact formula given as a function of m
and V, the striking velocity.

8.  The maximum effective range of a fragment depends on m and
E[V_0].

As  a  result  of  these  assumptions,  one  may  write 

(18)  $A_L = 2\pi \int_0^{R_{max}} \left(1 - e^{-\rho(m)\, E[A_p]\, P_{HK}(m, V)}\right) R\, dR .$

This is the lethal area equation written in polar coordinates, making use
of the fact that ω = 90° yields radial symmetry. The density ρ is


(19)  $\rho = \dfrac{w}{2\pi m r^2 \left[\cos\theta_1 - \cos\theta_2\right]}$


where r is the range from the burst point to the target under consideration
and w = weight of fragmenting material. The relation

(20)  $N_f = \dfrac{w}{m}$ ,

N_f being the number of fragments, was used in obtaining equation (19).


$A_p = f(h, R)$, a known quantity

$V = V_0\, e^{-\alpha R\, m^{-1/3}}$

$P_{HK} = P_{HK}(\eta(m, V))$

A typical plot of P_HK is shown in Figure 2.
Note that a certain cut-off point A exists such that for η(m, V) < A,
P_HK = 0. P_HK is non-differentiable at this point A.

To  obtain  the  variability  of  lethal  area,  it  is  first  convenient  to  reduce 
the  integral  form  of  the  lethal  area  equation  to  one  that  is  purely  algebraic. 
This  is  accomplished  by  selecting  a  numerical  scheme  to  evaluate  the 
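integral. In this case the trapezoidal rule was used. Thus

(21)  $A_L = 2\pi\, \Delta R \sum_{i=1}^{M} R_i\, P_{K_i}$ ,

where $P_{K_i}$ denotes the integrand of (18), $1 - e^{-\rho E[A_p] P_{HK}}$, evaluated at $R_i$.
Here it is assumed that $R_0 = 0$ and M is so large that

(22)  $P_{K_M} = P_K(R_M) = 0 .$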
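The reduction to a plain sum can be sketched numerically. With R_0 = 0 and a kill probability that has vanished by R_M, both endpoint terms of the trapezoidal rule drop out, which is why (21) carries no half-weights. In the sketch below, `p_kill_demo` is a hypothetical stand-in for P_K (the paper's P_K depends on ρ, A_p and P_HK, none of which are reproduced here); it was chosen only because its exact integral, 1600π, is known, so the effect of the step size ΔR can be seen.

```python
import math

def lethal_area(p_kill, delta_r, r_max):
    """Equation (21): A_L ~ 2*pi*dR * sum_{i=1..M} R_i * P_K(R_i).

    Assumes R_0 = 0 and P_K(r_max) ~ 0, so the trapezoidal endpoint
    terms vanish and only the plain sum remains.
    """
    m = int(round(r_max / delta_r))
    total = 0.0
    for i in range(1, m + 1):
        r_i = i * delta_r
        total += r_i * p_kill(r_i)
    return 2.0 * math.pi * delta_r * total

def p_kill_demo(r):
    # Hypothetical kill probability, NOT the paper's formula.
    return math.exp(-(r / 40.0) ** 2)

# Exact value of 2*pi * integral_0^inf R * exp(-(R/40)^2) dR.
EXACT = 1600.0 * math.pi

def relative_error(delta_r):
    approx = lethal_area(p_kill_demo, delta_r, 400.0)
    return abs(approx - EXACT) / EXACT

for step in (5.0, 10.0, 20.0):
    print(step, lethal_area(p_kill_demo, step, 400.0), relative_error(step))
```

For this stand-in the error grows roughly with the square of the step, which is the kind of step-size sensitivity the paper examines in Table III.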

Clearly

(23)  $E[A_L] = 2\pi\, \Delta R \sum_{i=1}^{M} R_i\, E[P_{K_i}]$

and

(24)  $V[A_L] = 4\pi^2 (\Delta R)^2\; V\!\left[\sum_{i=1}^{M} R_i P_{K_i}\right] .$

The latter equation can be put in a more convenient form by employing
equation (25).

(25)  $V\!\left[\sum_{i=1}^{M} R_i P_{K_i}\right] = E\!\left[\sum_{i=1}^{M} R_i P_{K_i} - E\!\left[\sum_{i=1}^{M} R_i P_{K_i}\right]\right]^2 = \sum_{i=1}^{M} R_i^2\, V[P_{K_i}] + 2\sum_{i<j} R_i R_j \left(E[P_{K_i} P_{K_j}] - E[P_{K_i}]\, E[P_{K_j}]\right)$

Since $V[P_{K_i}] = E[P_{K_i}^2] - (E[P_{K_i}])^2$, equations (23) and (24) require only
that expressions of the form $E[P_{K_i}]$, $E[P_{K_i}^2]$, and $E[P_{K_i} P_{K_j}]$
be evaluated to ascertain the end result. This is described next.
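The decomposition of the variance of a weighted sum into variance and covariance terms can be checked on sample moments, for which it holds exactly. The sketch below is only an illustration: the response `exp(-R / (3 m))` is a hypothetical stand-in for P_K, the ranges are arbitrary, and a single uniformly distributed mass m drives all ranges at once, which is what makes the covariance terms large. The paper reports that covariance terms contributed as much as 87% of the total variance of lethal area.

```python
import math
import random

def mean(xs):
    return sum(xs) / len(xs)

def covariance(xs, ys):
    # Population-style (1/n) sample covariance; covariance(x, x) is the variance.
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def variance_of_weighted_sum(ranges, samples):
    """Return (direct variance, decomposed variance, covariance share).

    samples[i][k] is the k-th draw of P_K at range ranges[i].
    """
    n = len(samples[0])
    totals = [sum(r * s[k] for r, s in zip(ranges, samples)) for k in range(n)]
    direct = covariance(totals, totals)
    diagonal = sum(r * r * covariance(s, s) for r, s in zip(ranges, samples))
    cross = 0.0
    for i in range(len(ranges)):
        for j in range(i + 1, len(ranges)):
            cross += ranges[i] * ranges[j] * covariance(samples[i], samples[j])
    return direct, diagonal + 2.0 * cross, 2.0 * cross / direct

rng = random.Random(0)
# Uniform mass with width 3, i.e. variance 3**2 / 12 = 0.75.
masses = [rng.uniform(8.5, 11.5) for _ in range(500)]
ranges = [10.0, 20.0, 30.0, 40.0]
# Hypothetical P_K(R, m); NOT the paper's kill-probability formula.
samples = [[math.exp(-r / (3.0 * m)) for m in masses] for r in ranges]

direct, decomposed, cross_share = variance_of_weighted_sum(ranges, samples)
print(direct, decomposed, cross_share)
```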

Following Reference 1, a logical method of proceeding would be to
expand $P_{K_i}$ in a Taylor series in the independent random variables
m and $V_0$. Thus

(27)  $P_K \approx \bar{P}_K + P_{Km}(m - \bar{m}) + P_{KV_0}(V_0 - \bar{V}_0) + \tfrac{1}{2}\left[P_{Kmm}(m - \bar{m})^2 + 2P_{KmV_0}(m - \bar{m})(V_0 - \bar{V}_0) + P_{KV_0V_0}(V_0 - \bar{V}_0)^2\right]$

In the right member of (27), $\bar{P}_K$, $P_{Km} = \partial P_K/\partial m$, ... are each understood
to be evaluated at the point $(\bar{m}, \bar{V}_0)$.
Here the subscript i has been omitted. Unfortunately a deficiency in the
above expansion exists in that $P_K$ may take on negative values and thus
become meaningless. This is illustrated in Figure 3.

In addition, a second deficiency results when using this expansion in that
for any given initial velocity $V_0$ there is a corresponding mass ($= m_0$)
where $P_K$ is non-differentiable. This arises from the point of
non-differentiability in the $P_{HK}$ equation, which is subsequently carried
through to the $P_K$ equation. (By assumption (8), $E[V_0]$ will be used instead of
the random variable $V_0$ to determine $m_0$.)

Both difficulties are overcome, however, if one forms two
separate expansions for $P_K$, namely:

(28)  for $m \ge m_0$:  $P_K = \sum_{n=0}^{\infty} \dfrac{1}{n!}\left[(m - \hat{m})\dfrac{\partial}{\partial m} + (V_0 - \bar{V}_0)\dfrac{\partial}{\partial V_0}\right]^n P_K$

      for $m < m_0$:  $P_K \equiv 0$

where for algebraic simplicity $\hat{m}$ is chosen by

(29)  $\hat{m} = E[m \mid m \ge m_0(R)]$

so that

$E[m - \hat{m} \mid m \ge m_0(R)] = 0 .$

Since

(30)  $E[Y] = E[Y \mid X < X_0] \cdot \mathrm{Prob}\{X < X_0\} + E[Y \mid X \ge X_0] \cdot \mathrm{Prob}\{X \ge X_0\}$ ,

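one may write for $P_K$

(31)  $E[P_K] = \left(P_K + \tfrac{1}{2} P_{Kmm}\, E[(m - \hat{m})^2 \mid m \ge m_0] + \tfrac{1}{2} P_{KV_0V_0}\, E[(V_0 - \bar{V}_0)^2 \mid m \ge m_0]\right) \cdot \mathrm{Prob}\{m \ge m_0\} ,$

where, on the right, $P_K$ and its derivatives are evaluated at $(\hat{m}, \bar{V}_0)$.
Similarly,

(32)  $V[P_K] = E\left[\,[P_K - E[P_K]]^2 \mid m \ge m_0\right] \mathrm{Prob}\{m \ge m_0\}$ ,

each term of which is easily evaluated.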

The covariance terms are handled by expanding $P_{K_i} P_{K_j}$ about a
selected point and formally taking the expected value of the product.
For example

$P_{K_i} P_{K_j} = \left[P_{K_i} + P_{K_i m}(m - \hat{m}) + \cdots\right]\left[P_{K_j} + P_{K_j m}(m - \hat{m}) + \cdots\right] \cdot \mathrm{Prob}\{m \ge m_0(R_j)\}$

so that

$E[P_{K_i} P_{K_j}] = \left(P_{K_i} P_{K_j} + \cdots + P_{K_i m} P_{K_j m}\, E[m - \hat{m}]^2 + \cdots\right) \mathrm{Prob}\{m \ge m_0(R_j)\} .$


For the uniform distribution

$f(x) = \dfrac{1}{b - a}$

it is easily verified that

$E[X - \bar{X}] = 0$

$E[X - \bar{X}]^2 = \dfrac{(b - a)^2}{12}$

$E[X - \bar{X}]^3 = 0$

$E[X - \bar{X}]^4 = \dfrac{(b - a)^4}{80}$

Equations  (23)  and  (24)  are  thus  completely  determined. 
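The uniform-distribution moments used above follow from a one-line integral. The worked step below assumes the density is uniform on an interval [a, b] (the endpoint names a, b and the half-width h are notation introduced here for illustration) and shifts the variable so that the mean sits at zero.

```latex
% Central moments of a uniform variable on [a, b]; h = (b - a)/2 is the half-width.
\[
E\big[(X - \bar{X})^k\big]
  = \frac{1}{2h}\int_{-h}^{h} t^k \, dt
  = \begin{cases}
      \dfrac{h^k}{k+1}, & k \text{ even},\\[4pt]
      0, & k \text{ odd},
    \end{cases}
\qquad h = \frac{b-a}{2}.
\]
\[
k = 2:\ \frac{h^2}{3} = \frac{(b-a)^2}{12},
\qquad
k = 4:\ \frac{h^4}{5} = \frac{(b-a)^4}{80}.
\]
```

As a usage note, a variance of .75 for m corresponds to an interval of width 3, and a variance of 75 for V_0 to an interval of width 30, since (b - a)^2 / 12 gives 9/12 and 900/12 respectively.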

These equations were used to obtain numerical estimates for the
variability of lethal area. These results were compared with a Monte
Carlo evaluation whereby m and V_0 were sampled from their respective
distributions. The results of the comparison are shown in Table 1.


TABLE 1

                        E[A_L]      σ[A_L]

STATISTICAL MODEL        4519        178.9

MONTE CARLO MODEL        4516        179.8

% Difference in E[A_L] = .06%
% Difference in σ[A_L] = .50%


In ascertaining these results the following variances were assigned:
V[m] = .75 and V[V_0] = 75; also, 200 simulations were used in the
Monte Carlo evaluation.
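The kind of comparison reported in Table 1 can be sketched on a toy problem. The code below is not the paper's lethal-area model: `demo_response` is a hypothetical smooth function standing in for P_K, the means 10 and 500 are arbitrary, the derivatives are taken by finite differences, and the variance uses only first-order terms (the paper carries higher-order terms). The variances .75 and 75 are the ones quoted above; 20,000 samples are used instead of the paper's 200 so that the sampling noise is small.

```python
import math
import random

def second_order_moments(g, m_bar, v_bar, var_m, var_v, hm=1e-3, hv=1e-2):
    """Taylor-series estimate of E[g] and V[g] for independent m and V0.

    Mean: second-order expansion about (m_bar, v_bar).
    Variance: first-order terms only (a simplification made in this sketch).
    """
    g0 = g(m_bar, v_bar)
    g_m = (g(m_bar + hm, v_bar) - g(m_bar - hm, v_bar)) / (2 * hm)
    g_v = (g(m_bar, v_bar + hv) - g(m_bar, v_bar - hv)) / (2 * hv)
    g_mm = (g(m_bar + hm, v_bar) - 2 * g0 + g(m_bar - hm, v_bar)) / hm ** 2
    g_vv = (g(m_bar, v_bar + hv) - 2 * g0 + g(m_bar, v_bar - hv)) / hv ** 2
    mean = g0 + 0.5 * g_mm * var_m + 0.5 * g_vv * var_v
    variance = g_m ** 2 * var_m + g_v ** 2 * var_v
    return mean, variance

def monte_carlo_moments(g, m_bar, v_bar, var_m, var_v, n, seed=1):
    """Sample m and V0 from uniform distributions with the given variances."""
    rng = random.Random(seed)
    half_m = math.sqrt(3.0 * var_m)  # half-width h with h**2 / 3 = variance
    half_v = math.sqrt(3.0 * var_v)
    draws = [g(rng.uniform(m_bar - half_m, m_bar + half_m),
               rng.uniform(v_bar - half_v, v_bar + half_v)) for _ in range(n)]
    mean = sum(draws) / n
    variance = sum((d - mean) ** 2 for d in draws) / (n - 1)
    return mean, variance

def demo_response(m, v0):
    # Hypothetical smooth stand-in for P_K; NOT the paper's formula.
    return 1.0 - math.exp(-m * v0 / 5000.0)

taylor = second_order_moments(demo_response, 10.0, 500.0, 0.75, 75.0)
sampled = monte_carlo_moments(demo_response, 10.0, 500.0, 0.75, 75.0, n=20000)
print(taylor, sampled)
```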

The next table shows a comparison of E[P_K] and V[P_K] at selected
ranges.


TABLE II

                   E[P_K]                 V[P_K] x 10^3

RANGE         M.C.*      S.M.          M.C.*      S.M.

  10          .9366      .9370         .1572      .1658
  20          .5482      .5491         .6372      .6667
  30          .2967      .2974         .2839      .2958
  40          .1736      .1740         .1068      .1112
  50          .1103      .1105         .0429      .0447
 100          .0222      .0222         .0011      .0012
 200          .0028      .0028         0          0
 270          .0008      .0008         0          0
 300          .0004      .0004         .00001     .00001
 370          0          0             0          0

*Based on a sample size of 100

A review of these results indicates that the statistical model does
provide a good approximation to the variability of lethal area, a method
which may possibly be generalized to include more realism in the model.

It is of interest to note that the covariance terms contributed as high
as 87% of the total variance of lethal area. The final table, also included
for interest, shows the effect of the step size used in the numerical
integration scheme on the results.


TABLE III

              ΔR = 5      ΔR = 10     ΔR = 20

E[A_L]         4554.       4519.       4304.

σ[A_L]         179.4       178.9       185.5

In summary, therefore, it is not the numerical results of this paper
that should be emphasized, but rather the possible application of a
straightforward technique to a complex problem involving the variability
of lethal area.


DECISION PROCEDURE FOR MINIMIZING COSTS
OF CALIBRATING LIQUID ROCKET ENGINES

Sidney  H.  Lishman  and  E.  L.  Bombara 
Engine  Program  Office,  Marshall  Space  Flight  Center 


SUMMARY: Prior to acceptance of a liquid rocket engine for use in
Saturn vehicles, the average thrust of two consecutive tests without an
intervening calibration must satisfy specification requirements. The
contractor may recalibrate after the first and subsequent tests if he so
chooses, based upon decision limits, until the above requirement is met.

This paper provides a method for calculating decision limits such
that the total number of tests required for acceptance is minimized. The
model for calculating the decision limit takes into account operational
reliability and life of the engine, ratio of cost of testing to cost of an
engine, and correlation between tests as a function of engine-to-engine
and run-to-run variance components.

INTRODUCTION. One of the requirements for NASA acceptance
of a Saturn vehicle engine is that the thrust averaged from two successive
tests without an intervening calibration fall within specification limits.

In the past, most engines were accepted from the contractor after three
tests, but when the specification was recently tightened it was estimated
that more than 50% of all engines would have to be tested at least four
times prior to acceptance. This increase in number of tests per engine
represented an appreciable increase in costs.

This  paper  presents  the  results  of  a  study  made  to  determine  what 
could  be  done  to  reduce  acceptance  testing  costs  when  the  specification 
limits  are  held  constant. 

DISCUSSION. Engine testing is conducted in accordance with the
following ground rules until the engine meets acceptance requirements
or until it is scrapped:

1.  If thrust in a test following a calibration is outside certain
decision limits, the engine is successively recalibrated and
tested until thrust falls inside the decision limits.

2.  If thrust in a test following a calibration is inside the decision
limits, no changes are made to the engine and another test is
conducted in an attempt to satisfy acceptance requirements.

3.  If the average thrust from two consecutive tests without
intervening calibration falls outside of specification limits, the
engine is recalibrated, and the test cycle is repeated.

It  should  be  pointed  out  that  the  value  in  using  a  procedure  such  as 
described  below  is  greatest  when  specification  limits  are  tight.  If 
specification  limits  are  very  wide,  there  is  not  much  point  in  using 
decision  limits  at  all,  because  the  need  for  recalibration  becomes  remote. 

ILLUSTRATIONS.  For  the  purpose  of  applications  herein,  the 
following  assumptions  were  made: 

1.  The  engine  is  always  calibrated  after  the  first  test  (due  to  high 
variability  of  thrust  prior  to  the  first  calibration). 

2.  There  is  no  bias  introduced  in  calibrating  the  engine. 

3.  After  the  initial  calibration,  ability  to  recalibrate  does  not 
improve  between  tests. 

4.  Cost  of  calibration  is  negligible  compared  to  cost  of  a  test. 

5.  The engine is scrapped after N tests that do not satisfy the
criterion for acceptance described above.

6.  The engine-to-engine and run-to-run variance components,
$\sigma_{EE}^2$ and $\sigma_{RR}^2$, respectively, are known; the mean thrust
is also known.

7.  $\sigma_{RR}^2$ is the same for all engines.

8.  Engine-to-engine and run-to-run deviates are normally and
independently distributed.


The models described below can easily be altered to change assumptions
1 through 5.

Two models are considered herein:

1.  Assume the engine is scrapped after nine unsuccessful tests, and
operational reliability = 1.0. Operational reliability is defined
as one minus the probability of any failure (hardware, facility,
human error) that causes a single additional test and calibration.
Assume that the cost of scrapping an engine is equal to the cost
of 40 tests.

2.  Assume the engine is scrapped after 5 unsuccessful tests and
operational reliability < 1.0.

Common to all models generated under the above assumptions, we define
the following probabilities (figure 1):

Let P(i) be the probability of thrust exceeding the decision limits in
the i-th test.

Let P(ī) be the conditional probability that the mean thrust,
(X_i + X_{i+1})/2, of the i-th and (i+1)-th tests exceeds the specification
limits.

It is assumed that P(i) is the same for all i, and that P(ī) is the same for
all i. Assuming normality, P(i) and P(ī) may be calculated from the
bivariate normal density as illustrated in figure 1.

P(i) and P(ī) may be obtained from equations (1), (2), (3) below by
using any table of the bivariate normal distribution, such as reference
(1). It is convenient to express the correlation coefficient as a function
of the run-to-run and engine-to-engine variance components, because
of the advantage gained by utilizing all pertinent data. From the appendix,
the standard deviation of $X_i$ is:

(1)  $\sigma_{X_i} = \sqrt{\sigma_{EE}^2 + \sigma_{RR}^2}$

The correlation coefficient between $X_i$ and $(X_i + X_{i+1})/2$ is:

(2)  $\rho_{X_i,(X_i+X_{i+1})/2} = \sqrt{1 - \tfrac{1}{2}\left(\sigma_{RR}/\sigma_{X_i}\right)^2}$

The standard deviation of $(X_i + X_{i+1})/2$ is:

(3)  $\sigma_{(X_i+X_{i+1})/2} = \sigma_{X_i}\; \rho_{X_i,(X_i+X_{i+1})/2}$


MODEL 1: Reliability = 1.0; engine is scrapped after 9 unsuccessful
tests.

Let the notation "2 3" describe the event that thrust of the second
test was within decision limits and that the average thrust of the second
and third test was within specification limits. Let the notation
"2_c 3 4_c 5 6" describe the following event:

*  Calibration after second test (thrust outside of decision limits).

*  No calibration after third test (thrust within decision limits).

*  Calibration after fourth test (mean thrust of third and fourth
test outside of specification limits).

*  Thrust in fifth test within decision limits.

*  Average thrust of fifth and sixth tests within specification limits.

Using this notation and the notation of figure 1, probabilities for the
various events are as follows:

TABLE 1

EVENT          PROBABILITY

1  J 

1,.  1  * 

Ci-p<t>.p<7i]  p(t> 

1  5,  »  I 

[i-p<1).p<I)]p(7>[i.p(1)] 

J*  J.  *  J 

[i.p(t).«7j)  [«,f 

»  S,  *,  l  I 

[l-PC^-Pd)]  p<t)  P0  [1-P(i>] 

3.  3  V3  * 

[i-p^J-pQ]  p(t)  P(T)  [l-P(t>] 

2.  3«*e3  '* 

[i.p<i).p<7)]  [p^)]3 

1  V  w 

£i-p<t>.p<ig  [p<702  Q-p<1>32 

3  *.VV*  T 

[i.Kt).pQ]  [«,)]’  «i>[i.p<tg 

3„  3  *c  >.  *  T 

|j-p<,>-p(7>][«t>33  «I)t»-p<i)] 

3«  V  *  V  1 

jj.Kl).pQ][«1)]3 

3.  3«  *.  ’« •  T 

[l.Kt).«P][p<i>]* 

1  I4  4  Te  6,  7  I 

|l-«t>-P0]  Ki)  .[PQK.-K.a3 

3  ><*,•  V  1 

&.«,>-«!>]  KtJ  (><D]S[l.Kt)]3 

3  *e  «,  3o  7  * 

[l,K,)-pQ]i«j>]3  p(J)  [l.«t>] 

3.  3  *.  3  \  7  » 

[l.Kt)-Kt)]  P<i> '  [«Tf  [l-Kt)]2 

3c  3  *«»,*„  7  * 

[l.p(l).p(7)]  [p(i’>]3  p Q  [l-K,)] 

3«  V  Sc‘c  7  1 

B-«l).pQ]  [M,)]3  «J)  [UK,)] 

3«  3e  \ 3  *. 7  * 

[i.p(I>.pQ][p(1>]3  p(J) 

3«  3,  s.  V  1 

[l.Ki>.pq)][K,)])

2  lc  4  1c  6  7C  8  ? 

D  -  f<tJ  •  '<70  t*Qf  U  -  r<i>2J 

2  7  4  1e  6  7  8  ? 

C  C  c  c 

D  -  '<i>  -  '<?]  BC)]1 0  *  '<,>]2 

J  l  «  J  J,  j  n 

C  c  c  c 

D  •  '<i>  -  '<70  -  '‘JJ2 

2  14  5  6  7  8? 

c  C  C  c 

[‘•'V-'Qlt'yWC'-'rf 

2  J  4  .  5  6  7  8  ? 

r  C  a  C  C 

D  -  'V  -  '<70  L  t(t>]4  '<7>  [>-'<i>] 

w  w  * 

2  3  *  3  1  7,  8  ? 
c  c  c  c 

D  -  '<i>  -  '<70  [  '<t>]2L'<7>72  D  -  »<,fl2 

2  3  *  3  6  7  8  ? 
c  c  c  c 

[i  -  nt>  ■  '<7>]  ['<1)]2['<7>]!  t1  -  '<i>]! 

23*5678? 
c  c  c  c  c 

[1  ■»(!>.  P<7>]  [«,>]4  lQ  [l  -  *<!>] 

2  3  4  3  6  7ce  ? 
c  c  c  C 

[i  -  ?<!> .  pQ]  [»<l>]2['<7>]2  [i  •  .»<*>]* 

2  3  4  3_  6A  7  8  7 
o  c  c  c  c 

[l  p(4>  -  pQ]  [  p(i>]  p<7>  [t  -  p(t)] 

2  3  4  3  3  7  8  ? 
c  c  o  c  c 

D  -  "P  •  '<?]  [  '<!>]*  '<7>  Cl  •  '<!>] 

2  3  4  5  6  7  8  ? 

C  c  c  c  ® 

&•«!>•  '<?]['<!>]* '9  [1  -'<!>] 

2  3,  K  K  6  7  87 

c  C  c  C  c  C 

D  -  «i>  •  '<!>]  [  '<i>]‘ 

Assume that the cost of one engine is equivalent to the cost of
M = 40 tests. Let $P_j$ be the sum of probabilities in table 1 associated
with those events requiring j tests. Then the expected number of tests
per accepted engine is:

(4)  $E(N) = \dfrac{\sum_{j=3}^{N} j\,P_j + M\left(1 - \sum_{j=3}^{N} P_j\right)}{\sum_{j=3}^{N} P_j}$

The quantity in parentheses is the probability that more than 9 tests
are required; i.e., the probability of scrapping the engine. Holding the
specification limit constant, the decision limit (figure 1) is varied until
E(N) is minimized.
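As a check on the cost formula, the expected number of tests per accepted engine can be computed directly from a set of P_j values. The sketch below takes the sum of j P_j, adds M times the scrap probability, and divides by the acceptance probability. The P_j are obtained by differencing the cumulative "% engines accepted after N tests" figures that the paper tabulates for the optimum decision limit with engine-to-engine sigma of 1200 lbs; the function and variable names are illustrative only.

```python
def expected_tests(p_by_tests, scrap_cost_in_tests):
    """Expected tests per accepted engine.

    p_by_tests maps j (number of tests at acceptance) to P_j.
    scrap_cost_in_tests is M, the cost of one engine expressed in tests.
    """
    p_accept = sum(p_by_tests.values())
    weighted = sum(j * p for j, p in p_by_tests.items())
    return (weighted + scrap_cost_in_tests * (1.0 - p_accept)) / p_accept

# Cumulative % accepted after N = 3..9 tests (optimum decision limit, case 1).
cumulative = [86.7, 95.5, 99.0, 99.6, 99.76, 99.79, 99.80]

p_by_tests = {}
previous = 0.0
for j, c in zip(range(3, 10), cumulative):
    p_by_tests[j] = (c - previous) / 100.0  # P_j = increment in acceptance
    previous = c

e_n = expected_tests(p_by_tests, 40.0)
print(e_n)
```

With these inputs the weighted sum is about 3.178 and the acceptance probability .998, so the result agrees with the 3.26 tests per accepted engine quoted in the paper to within the rounding of the tabulated percentages.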

In illustration, this model was used to support contract negotiations
in an engine program where reliability of the engine is very high.
Practice is to scrap the engine after 9 unsuccessful tests. Data showed
that the square root of the within-engine, or run-to-run, variance
component of thrust was 600 lbs., and the square root of the
engine-to-engine variance component was between 1200 and 1500 lbs. Both
extremes were analyzed, as follows:


Case 1:  σ_EE = 1200 lbs., σ_RR = 600 lbs.

From equation (1),  $\sigma_X = \sqrt{(600)^2 + (1200)^2} = 1340$ lbs.

From equation (2),  $\rho_{X_i,(X_i+X_{i+1})/2} = \sqrt{1 - \tfrac{1}{2}\left(600/1340\right)^2} = .95$

From equation (3),  $\sigma_{(X_i+X_{i+1})/2} = .95(1340) = 1270$ lbs.


Suppose the specification limits for thrust are nominal ± 2000 lbs.
Then the number of standard deviations between nominal and the
specification limit (two-sided) is 2000/1270 = 1.57. By trial and error,
equation (4) is minimized when the decision limits are nominal ± 1.7(1340) =
nominal ± 2200 lbs, when E(N) = [3.178 + 40(.0020)]/.998 = 3.26 tests
per accepted engine.

Case 2:  σ_EE = 1500 lbs., σ_RR = 600 lbs.

From equation (1),  $\sigma_X = \sqrt{(600)^2 + (1500)^2} = 1620$ lbs.

From equation (2),  $\rho_{X_i,(X_i+X_{i+1})/2} = \sqrt{1 - \tfrac{1}{2}\left(600/1620\right)^2} = .965$

From equation (3),  $\sigma_{(X_i+X_{i+1})/2} = .965(1620) = 1560$ lbs.


The number of standard deviations between nominal and the
specification limit (two-sided) is 2000/1560 = 1.28. By trial and error,
equation (4) is minimized when the decision limits are nominal
± 1.5(1620) = nominal ± 2430 lbs, when E(N) = [3.286 + 40(.0122)]/.988
= 3.8 tests per accepted engine. (Note that changing the ratio of
σ_RR/σ_X from 600/1340 in case 1 to 600/1620 in case 2 changes the
correlation coefficient by only .015, and merely changes the optimum
decision limits from 1.7 to 1.5 standard deviations. E(N) changes
significantly, from 3.3 to 3.8 tests per accepted engine.)
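The arithmetic of the two cases can be reproduced in a few lines. The sketch assumes the relations used in the worked cases: sigma_X = sqrt(sigma_EE^2 + sigma_RR^2), rho = sqrt(1 - (1/2)(sigma_RR / sigma_X)^2), and the standard deviation of the two-test mean equal to rho times sigma_X. The function name is illustrative.

```python
import math

def thrust_statistics(sigma_ee, sigma_rr):
    """Return (sigma_X, rho, sigma_mean) from the two variance components."""
    sigma_x = math.sqrt(sigma_ee ** 2 + sigma_rr ** 2)          # single-test sigma
    rho = math.sqrt(1.0 - 0.5 * (sigma_rr / sigma_x) ** 2)      # corr(X_i, mean)
    sigma_mean = rho * sigma_x                                  # sigma of two-test mean
    return sigma_x, rho, sigma_mean

case_1 = thrust_statistics(1200.0, 600.0)
case_2 = thrust_statistics(1500.0, 600.0)

for sigma_x, rho, sigma_mean in (case_1, case_2):
    # 2000 lbs is the specification half-width used in both cases.
    print(round(sigma_x), round(rho, 3), round(sigma_mean), round(2000.0 / sigma_mean, 2))
```

Unrounded, the values come out slightly different from the three-figure numbers quoted in the text (for example, about 1342 rather than 1340 lbs for case 1), which is consistent with slide-rule rounding in the original.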


Other information of interest corresponding to decision limits is
the following:

A.  Prob. of acceptance after N tests $= \sum_{j=3}^{N} P_j$

B.  Prob. of scrapping engine after N tests $= 1 - \sum_{j=3}^{N} P_j$

C.  Percent engines requiring calibration after second test = P(i).
Of these, the four "corners" of the bivariate distribution
are necessary (see figure 1).


Prior to this analysis, the contractor had been using arbitrary
decision limits of nominal ± (2000 - 2σ_RR). Advantages gained by
minimizing expected number of tests are also obtained from A, B,
and C above, as follows:

COMPARISON OF DECISION LIMITS
ASSUMING THAT AFTER 9 UNSUCCESSFUL TESTS
THE ENGINE IS SCRAPPED

σ_EE = 1200 lbs.    σ_RR = 600 lbs.
(Spec. = Nominal ± 1.6 Sigma)
(Assume 1 Engine = 40 Tests)

                                     Decision Limit         Optimum Dec. Limit
                                     = nominal ± 0.6 σ_X    = nominal ± 1.7 σ_X

Prob. of Acceptance                  .45                    .87
after 3 Tests

% Engines requiring                  55%                    10%
calibration after                    (of these, 20%         (of these, 77%
2nd test                             are necessary)         are necessary)

Average number of tests
required for acceptance
   due to recalibration              4.11                   3.18
   due to scrapped engine            0.61                   0.08
   Total                             4.72                   3.26
                                        (Δ = 1.5 tests/engine)

Expected Number of                   1.5                    0.2
Scrapped Engines
Per 100 Tested

% Engines Accepted
after N Tests
   N = 3                             45.1                   86.7
   N = 4                             69.9                   95.5
   N = 5                             83.5                   99.0
   N = 6                             90.0                   99.6
   N = 7                             95.0                   99.76
   N = 8                             97.3                   99.79
   N = 9                             98.5                   99.80
COMPARISON OF DECISION LIMITS
ASSUMING THAT AFTER 9 UNSUCCESSFUL TESTS
THE ENGINE IS SCRAPPED

σ_EE = 1500 lbs.    σ_RR = 600 lbs.
(Spec. = Nominal ± 1.3 Sigma)
(Assume 1 Engine = 40 Tests)

                                     Decision Limit         Optimum Dec. Limit
                                     = nominal ± 0.5 σ_X    = nominal ± 1.5 σ_X

Prob. of Acceptance                  .38                    .78
after 3 Tests

% Engines Requiring                  62%                    16%
Calibration After                    (of these, 31%         (of these, 84%
2nd Test                             are necessary)         are necessary)

Average Number of Tests
Required for Acceptance
   due to recalibration              4.4                    3.3
   due to scrapped engine            1.4                    0.5
   Total                             5.8                    3.8
                                        (Δ = 2.0 Tests/Engine)

Expected Number of                   3.5                    1.2
Scrapped Engines
per 100 Tested

% Engines Accepted
after N Tests
   N = 3                             38.0                   78.0
   N = 4                             61.9                   90.6
   N = 5                             76.4                   96.5
   N = 6                             85.4                   98.0
   N = 7                             91.0                   98.5
   N = 8                             94.4                   98.7
   N = 9                             96.5                   98.8
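The "% engines requiring calibration after 2nd test" entries for the contractor's earlier limits can be approximated from a univariate normal table. The sketch below assumes thrust deviations from nominal are normal with zero mean (no calibration bias) and standard deviation sigma_X, and uses the contractor's half-width of 2000 - 2(600) = 800 lbs; the function names are illustrative.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_outside_limits(half_width, sigma):
    """Two-sided probability that a zero-mean normal deviate with
    standard deviation sigma falls outside +/- half_width."""
    return 2.0 * (1.0 - normal_cdf(half_width / sigma))

contractor_half_width = 2000.0 - 2.0 * 600.0   # 800 lbs

p_case_1 = prob_outside_limits(contractor_half_width, 1340.0)
p_case_2 = prob_outside_limits(contractor_half_width, 1620.0)
print(round(p_case_1, 3), round(p_case_2, 3))
```

Under these assumptions the two probabilities come out near the 55% and 62% shown in the tables for the contractor's limits.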
MODEL 2: Reliability < 1.0; engine is scrapped after 5 unsuccessful
tests. Assume that the engine is scrapped when the contractor fails to
meet requirements for acceptance after 5 successive tests with calibration.
Let 1 - R_1 be the probability of failure in the first test, where "failure" is
any event that causes a single additional test as in table 2, and similarly
for 1 - R_2 in the second test, etc. A curve of reliability vs. number of tests
may be obtained from past experience, as in figure 2.


Figure 2. OPERATIONAL RELIABILITY VS. NUMBER OF TESTS


Let the notation "1_c 2_F2 3_c 4 5" describe the following event:

Calibration after first test.

Failure during second test.

Calibration after third test.

Thrust in fourth test within decision limits.

Average thrust of fourth and fifth tests within specification limits.

As before, the engine is always calibrated after the first test,
unless failure occurs. Using the notation P(i) and P(ī) as in model 1,
probabilities for the various events are as follows:

TABLE 2

EVENT                   PROBABILITY

1_c 2 3                 R_1 R_2 R_3 [1 - P(i) - P(ī)]
1_c 2_c 3 4             R_1 R_2 R_3 R_4 P(i) [1 - P(i) - P(ī)]
1_F1 2_c 3 4            (1 - R_1) R_2 R_3 R_4 [1 - P(i) - P(ī)]
1_c 2_F2 3 4            (1 - R_2) R_1 R_3 R_4 [1 - P(i) - P(ī)]
1_c 2_c 3_c 4 5         R_1 R_2 R_3 R_4 R_5 [P(i)]^2 [1 - P(i) - P(ī)]
1_c 2 3_c 4 5           R_1 R_2 R_3 R_4 R_5 [1 - P(i)] [P(ī)] [1 - P(i) - P(ī)]
1_F1 2_F2 3_c 4 5       (1 - R_1)(1 - R_2) R_3 R_4 R_5 [1 - P(i) - P(ī)]
1_F1 2_c 3_F3 4 5       (1 - R_1)(1 - R_3) R_2 R_4 R_5 [1 - P(i) - P(ī)]
1_c 2_F2 3_F3 4 5       (1 - R_2)(1 - R_3) R_1 R_4 R_5 [1 - P(i) - P(ī)]
1_F1 2_c 3_c 4 5        (1 - R_1) R_2 R_3 R_4 R_5 [P(i)]^2 [1 - P(i) - P(ī)]
1_c 2_F2 3_c 4 5        (1 - R_2) R_1 R_3 R_4 R_5 P(i) [1 - P(i) - P(ī)]
1_c 2_c 3_F3 4 5        (1 - R_3) R_1 R_2 R_4 R_5 P(i) [1 - P(i) - P(ī)]
Assuming that the cost of one engine is equivalent to the cost of M
tests, and letting P_j be the sum of probabilities in table 2 associated
with j tests, the expected number of tests per accepted engine is given by
equation (4).

Case 1: Reliability < 1.0

In illustration suppose σ_EE = 1200 lbs., σ_RR = 600 lbs., specification
limits are nominal ± 2000 lbs., and the cost of one engine is equivalent
to the cost of 40 tests. Then as in model 1, case 1, we have:

$\sigma_{X_i} = 1340$ lbs.

$\rho_{X_i,(X_i+X_{i+1})/2} = .95$

$\sigma_{(X_i+X_{i+1})/2} = 1270$ lbs.

Number of standard deviations between nominal and specification limit
= 1.57.

Calculate P_j from table 2 for j = 3, 4, 5, utilizing operational
reliability values of figure 2. By trial and error, equation (4) is
minimized when the decision limits are nominal ± 1.8 standard
deviations, and E(N) = 24.6 tests per accepted engine.

Case 2: Reliability = 1.0 (same correlation coefficient and standard
deviations as in case 1)

It is of interest to observe the partial effect of reliability on the
optimum decision limits and expected number of tests, E(N). Let R_1
through R_5 be 1.0. Then utilizing table 2 (or table 1 for j = 3, 4, 5),
calculate P_j. The standard deviations of X_i and (X_i + X_{i+1})/2 and
correlation coefficient are the same as in case 1. Equation (4) is minimized
when the decision limits are nominal ± 1.5 standard deviations and E(N)
= 3.6 tests per accepted engine. In comparing these values to those in
case 1, note that the optimum decision limits become tighter, and the
number of tests per accepted engine decreases as reliability increases.

By comparison of results in Model 1, case 1 to those of Model 2,
case 2, a measure of the effect of scrapping the engine after 9 versus
5 tests is obtained. The optimum decision limits are nominal ± 1.7
standard deviations in the former, and E(N) = 3.3 tests per accepted
engine; in the latter, the optimum decision limits are nominal ± 1.5
standard deviations, and E(N) = 3.6 tests.

APPLICATIONS: The minimum expected number of tests per
accepted engine, E(N), provides a convenient yardstick for trade-off
studies. For example, one might want to determine whether or not
the cost of overhauling test facilities in order to improve operational
reliability by, say, 5%, is worthwhile. Or, one might want to determine
whether the cost of reducing engine-to-engine variability by
improving calibration techniques or equipment is offset by the reduced
number of tests required for acceptance, etc.

APPENDIX


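$\rho_{X_i,(X_i+X_{i+1})/2} = \dfrac{\mathrm{cov}\left(X_i,\ \frac{X_i+X_{i+1}}{2}\right)}{\sigma_{X_i}\; \sigma_{(X_i+X_{i+1})/2}}$


Substituting from equations (7) and (8), this is

(5)  $\rho_{X_i,(X_i+X_{i+1})/2} = \dfrac{\sigma_{X_i} + \rho_{X_i,X_{i+1}}\, \sigma_{X_{i+1}}}{\sqrt{\sigma_{X_i}^2 + \sigma_{X_{i+1}}^2 + 2\rho_{X_i,X_{i+1}}\, \sigma_{X_i}\sigma_{X_{i+1}}}}$


Substituting from equation (9),

(6)  $\rho_{X_i,(X_i+X_{i+1})/2} = \dfrac{\sigma_{X_i} + \frac{1}{2\sigma_{X_i}}\left(\sigma_{X_i}^2 + \sigma_{X_{i+1}}^2 - 2\sigma_{RR}^2\right)}{\sqrt{2\sigma_{X_i}^2 + 2\sigma_{X_{i+1}}^2 - 2\sigma_{RR}^2}} = \dfrac{3\sigma_{X_i}^2 + \sigma_{X_{i+1}}^2 - 2\sigma_{RR}^2}{\sigma_{X_i}\sqrt{8\left(\sigma_{X_i}^2 + \sigma_{X_{i+1}}^2 - \sigma_{RR}^2\right)}}$


assuming $\sigma_{X_i} = \sigma_{X_{i+1}}$,

(6A)  $\rho_{X_i,(X_i+X_{i+1})/2} = \sqrt{1 - \dfrac{\sigma_{RR}^2}{2\sigma_{X_i}^2}}$


or, letting $\sigma_{X_i} = \sigma_{X_{i+1}}$ in equation (5), this result may be expressed
as:

(6B)  $\rho_{X_i,(X_i+X_{i+1})/2} = \dfrac{1}{\sqrt{2}}\sqrt{1 + \rho_{X_i,X_{i+1}}}$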

Let $\sigma_{EE}^2$ be the engine-to-engine variance component. Using the
relation $\sigma_X = \sqrt{\sigma_{EE}^2 + \sigma_{RR}^2}$ together with equations (6A) and (8B),
the volumes in the corners of figure 1 are computed from a bivariate
normal distribution table. Using this result plus a univariate normal
table, P(i) and P(ī) as defined on pgs. 175 and 177 are calculated.


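$\mathrm{cov}\left(X_i,\ \frac{X_i+X_{i+1}}{2}\right) = E\left[X_i\, \frac{X_i+X_{i+1}}{2}\right] - E(X_i)\, E\left(\frac{X_i+X_{i+1}}{2}\right)$

$\qquad = \tfrac{1}{2}\left\{E(X_i^2) - [E(X_i)]^2 + E(X_i X_{i+1}) - E(X_i)\,E(X_{i+1})\right\}$

(7)  $\qquad = \tfrac{1}{2}\left[\sigma_{X_i}^2 + \mathrm{cov}(X_i, X_{i+1})\right] = \tfrac{1}{2}\left[\sigma_{X_i}^2 + \rho_{X_i,X_{i+1}}\,\sigma_{X_i}\sigma_{X_{i+1}}\right]$


If $\sigma_{X_i} = \sigma_{X_{i+1}}$, this becomes

(7A)  $\mathrm{cov}\left(X_i,\ \frac{X_i+X_{i+1}}{2}\right) = \dfrac{\sigma_{X_i}^2}{2}\left[1 + \rho_{X_i,X_{i+1}}\right]$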


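(8)  $\sigma_{(X_i+X_{i+1})/2} = \tfrac{1}{2}\sqrt{\sigma_{X_i}^2 + \sigma_{X_{i+1}}^2 + 2\rho_{X_i,X_{i+1}}\,\sigma_{X_i}\sigma_{X_{i+1}}}$


If $\sigma_{X_i} = \sigma_{X_{i+1}}$,

(8A)  $\sigma_{(X_i+X_{i+1})/2} = \dfrac{\sigma_{X_i}}{\sqrt{2}}\sqrt{1 + \rho_{X_i,X_{i+1}}}$


From (6B), this becomes

(8B)  $\sigma_{(X_i+X_{i+1})/2} = \sigma_{X_i}\; \rho_{X_i,(X_i+X_{i+1})/2}$


From equation (9A), (8A) becomes

(8C)  $\sigma_{(X_i+X_{i+1})/2} = \sqrt{\sigma_{X_i}^2 - \tfrac{1}{2}\sigma_{RR}^2}$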

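Sum of squares run-to-run is

$$SS_{RR} = \tfrac{1}{2}\sum_{a=1}^{N}\left(X_{i,a} - X_{i+1,a}\right)^2 = \tfrac{1}{2}\sum_{a=1}^{N}\left[(X_{i,a} - \mu_i) - (X_{i+1,a} - \mu_{i+1}) + (\mu_i - \mu_{i+1})\right]^2 .$$

Assuming $\mu_i = \mu_{i+1}$, and since $MS_{RR} = SS_{RR}/N$,

$$MS_{RR} = \tfrac{1}{2}\left[\frac{\sum (X_i - \mu_i)^2}{N} + \frac{\sum (X_{i+1} - \mu_{i+1})^2}{N} - \frac{2\sum (X_i - \mu_i)(X_{i+1} - \mu_{i+1})}{N}\right].$$

Taking expectations, the run-to-run variance component is:

$$\sigma_{RR}^2 = \tfrac{1}{2}\left[\sigma_{X_i}^2 + \sigma_{X_{i+1}}^2 - 2\operatorname{cov}(X_i, X_{i+1})\right] = \tfrac{1}{2}\left[\sigma_{X_i}^2 + \sigma_{X_{i+1}}^2 - 2\rho_{X_i,X_{i+1}}\,\sigma_{X_i}\sigma_{X_{i+1}}\right]$$

so that

$$\rho_{X_i,X_{i+1}} = \frac{\sigma_{X_i}^2 + \sigma_{X_{i+1}}^2 - 2\sigma_{RR}^2}{2\,\sigma_{X_i}\sigma_{X_{i+1}}},$$

and, if $\sigma_{X_i} = \sigma_{X_{i+1}}$,

$$\rho_{X_i,X_{i+1}} = 1 - \frac{\sigma_{RR}^2}{\sigma_{X_i}^2}.$$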
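The relations in this appendix can be checked numerically. The sketch below is only an illustration, assuming equal variances for the two runs; the function name `run_to_run_relations` and its arguments are hypothetical and do not come from the paper. It evaluates the run-to-run correlation, the correlation between a single run and the two-run average, and the standard deviation of that average.

```python
import math

def run_to_run_relations(sigma_x, sigma_rr):
    """Return (rho_runs, rho_avg, sigma_avg), assuming sigma_Xi == sigma_Xi+1.

    rho_runs  : correlation between consecutive runs X_i and X_{i+1}
    rho_avg   : correlation between X_i and the average (X_i + X_{i+1}) / 2
    sigma_avg : standard deviation of the two-run average
    """
    if sigma_x <= 0 or not 0 <= sigma_rr <= sigma_x:
        raise ValueError("need 0 <= sigma_rr <= sigma_x and sigma_x > 0")
    rho_runs = 1.0 - sigma_rr**2 / sigma_x**2     # rho = 1 - s_RR^2 / s_X^2
    rho_avg = math.sqrt((1.0 + rho_runs) / 2.0)   # sqrt((1 + rho) / 2)
    sigma_avg = sigma_x * rho_avg                 # s_avg = s_X * rho_avg
    return rho_runs, rho_avg, sigma_avg

if __name__ == "__main__":
    for s_rr in (0.0, 0.5, 1.0):
        print(s_rr, run_to_run_relations(1.0, s_rr))
```

With no run-to-run scatter both correlations equal 1; when all the variance is run-to-run, the correlation with the average falls to 1/sqrt(2), about .707.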


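LIMITS AND RELATIONSHIPS:

$$\rho_{X_i,X_{i+1}} = 2\rho^2_{X_i,\,\frac{X_i+X_{i+1}}{2}} - 1$$

When $\sigma_{EE} = 0$, $\rho_{X_i,X_{i+1}} = 0$; when $\sigma_{RR} = 0$, $\rho_{X_i,X_{i+1}} = 1.0$.

FIGURE 4. [Plot relating $\rho_{X_i,(X_i+X_{i+1})/2}$ to $\rho_{X_i,X_{i+1}}$; the scale runs from $1/\sqrt{2} = .707$ to 1.00.]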


THE  THEORETICAL  STRENGTH  OF  TITANIUM 
CALCULATED  FROM  THE  COHESIVE  ENERGY 


Perry  R.  Smoot 
Research  Physical  Metallurgist 
U.  S.  Army  Materials  Research  Agency 
Watertown,  Massachusetts 


ABSTRACT. The derivation of the equation for theoretical strength,
$\sigma_{max} = \sqrt{ES/r_e}$, was reviewed, and certain assumptions made were
considered questionable. Therefore, the accuracy of the equation was
considered questionable, and a new method for calculating the theoretical
strength of crystals was derived, utilizing the Morse potential equation
and the cohesive energy. The theoretical strength of titanium calculated
by this method was 3.28 x 10^6 psi.


I. INTRODUCTION. The U. S. Army Materials Research Agency
is engaged in the development of strong, lightweight titanium alloys for
use in Army weapons. In the course of this investigation, a question
arose concerning the maximum strength theoretically obtainable, and
how it might be obtained. It has been proposed for some time that the
theoretical strength of metals is considerably higher than that normally
observed. Polanyi¹ presented a method of calculating theoretical strength,
in which the bonding force between atoms, as a function of internuclear
separation, was taken to be approximately as shown in Figure 1. As a
brittle crack progressed through the crystal in the manner shown in
Figure 2, the interatomic bonds were extended and broken, and the work
done against each bond was equal to the area under the curve between the
two separations r marked in Figure 1. As the brittle crack progressed,
new surfaces were created having a surface energy of 2S, and this energy
was assumed equal to the work done against the interatomic bonds. On this
basis,
an algebraic solution for $\sigma_{max}$, the maximum theoretical stress at
fracture, was obtained, as follows:

$$\sigma_{max} = \sqrt{\frac{ES}{r_e}}$$

where E = the elastic modulus
S = the surface energy
r_e = the equilibrium atomic spacing
The value of $\sigma_{max}$ was ordinarily expected to be of the order of E/10,
which was about 1 to 10 x 10^6 psi for most common metals². Indeed, very
high strength values of this order of magnitude have been obtained in
metallic and ceramic whiskers.

In the derivation of this expression, the following energy balance
is assumed:

Energy to fracture bonds = surface energy of new surface created by fracture.


No theoretical basis for this assumption is apparent, since the mechanism
by which surface energy arises, and its relationship, if any, to the energy
required to fracture the interatomic bonds are not known. Consequently,
this expression for the theoretical strength is considered to be of
questionable accuracy.

Another questionable aspect in the derivation of this method is the
assumption that the interatomic force vs. displacement curve (Figure 1)
is sinusoidal. In addition, there is a practical difficulty in calculating
theoretical strengths by this method, since surface energy values for
solids are not available for most elements (including titanium).

Because of these questions as to the correctness of this method,
and because of the lack of surface energy data, it was desired to discover
another method for calculating theoretical strength in which more
confidence could be felt, perhaps by some means involving computation of the
actual forces between atoms.

II. CALCULATIONS. As a result of an inquiry, Dr. R. J. Weiss³
suggested a method for calculating the theoretical strength of metals by
means of the Morse potential equation:

$$V = D\left[e^{-2a(r-r_e)} - 2e^{-a(r-r_e)}\right]$$

where V (eV) = potential energy
D (eV/atom) = cohesive energy (the heat of vaporization, $\Delta H_v$) per atom
a (1/Å) = a constant
r (Å) = the internuclear separation
r_e (Å) = the equilibrium separation


This  equation  related  the  bonding  energy  of  two  hydrogen  atoms  to  the 
internuclear  separation,  as  shown  in  Figure  3. 

The energy values given by this function agreed well with:

a. experimental values of bonding energy vs. separation for the H2+
molecule (except under compression; see Table I)⁵,

b. theoretical values of bonding energy calculated in a few cases
by quantum mechanics⁶,

c. experimental values for compressibility for sixteen metals
(see Figure 4)⁷.

It was therefore considered reasonable to assume that this
relationship applied between the atoms in a crystal, with a modification
consisting of multiplying the cohesive energy by a factor to take into
account the division of the cohesive energy between nearest neighbors in a
crystal. Differentiation of this equation yielded an expression for
interatomic force, and it was suggested that this be used to calculate the
maximum force between atoms and the theoretical strength of metals. The
accuracy of the calculations was expected to be within a factor of two at
worst, and probably much better.

As mentioned above, it was necessary to multiply the cohesive
energy by a factor to take into account its division between N nearest
neighbors. The energy contributed by each atom to each interatomic
bond was $\frac{1}{N}D$. Since this contribution was made by one atom at each
end of each bond, the total energy of one bond was $\frac{2}{N}D$. Therefore,
the cohesive energy, D, was multiplied by $\frac{2}{N}$ when the Morse
potential equation was applied to a crystal.


The  theoretical  strength  of  a  titanium  crystal  was  calculated  by 
means  of  the  method  suggested  above  by  Dr.  Weiss.  The  Morse  potential 
equation  for  a  crystal  was  differentiated  as  follows; 


$$V\ (\text{eV}) = \frac{2}{N}D\left[e^{-2a(r-r_e)} - 2e^{-a(r-r_e)}\right] = \frac{2}{N}D\left[e^{-2ar+2ar_e} - 2e^{-ar+ar_e}\right]$$

$$V = \frac{2}{N}D\,e^{2ar_e}e^{-2ar} - \frac{4}{N}D\,e^{ar_e}e^{-ar} \qquad \text{(1) (Morse potential equation for a crystal)}$$

This equation was similar to $y = k e^{bx}$, where k and b are constants.
Since $\frac{dy}{dx} = b k e^{bx}$, then

$$\frac{dV}{dr} = F\ (\text{interatomic force, eV/Å}) = -2a\,e^{2ar_e}e^{-2ar}\,\frac{2}{N}D + 2a\,e^{ar_e}e^{-ar}\,\frac{2}{N}D$$

$$F = \frac{4}{N}Da\left[-e^{2a(r_e-r)} + e^{a(r_e-r)}\right] \qquad \text{(2)}$$

$$\frac{dF}{dr} = K\ (\text{interatomic spring constant, eV/Å}^2) = 4a^2 e^{2ar_e}e^{-2ar}\,\frac{2}{N}D - 2a^2 e^{ar_e}e^{-ar}\,\frac{2}{N}D$$

$$K = \frac{4}{N}Da^2\left[2e^{2a(r_e-r)} - e^{a(r_e-r)}\right] \qquad \text{(3)}$$

It should be noted that when $r = r_e$,

$$K = \frac{4}{N}Da^2 \qquad \text{(4)}$$

and when K = 0,

$$4a^2 e^{2ar_e}e^{-2ar}\,\frac{2}{N}D = 2a^2 e^{ar_e}e^{-ar}\,\frac{2}{N}D$$

$$2e^{ar_e}e^{-ar} = 1$$

$$e^{ar} = 2e^{ar_e}$$

$$ar \ln e = \ln 2 + ar_e \ln e$$

$$ar = \frac{\ln 2}{\ln e} + ar_e$$

and

$$r = r_e + \frac{\ln 2}{a \ln e} = r_e + \frac{.693}{a} \qquad \text{(5)}$$
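The differentiation above can be checked numerically. The following is a minimal sketch, not the author's own computation: the function names are hypothetical, and the parameter values in the usage lines are arbitrary test values rather than titanium data. It evaluates the crystal form of the Morse potential, the force F = dV/dr, and the spring constant K = dF/dr, and confirms that K vanishes at r = r_e + ln 2 / a.

```python
import math

def bond_potential(r, d, a, r_e, n):
    """Morse potential of one bond in a crystal with n nearest neighbors."""
    x = a * (r_e - r)
    return (2.0 * d / n) * (math.exp(2.0 * x) - 2.0 * math.exp(x))

def bond_force(r, d, a, r_e, n):
    """Interatomic force F = dV/dr."""
    x = a * (r_e - r)
    return (4.0 * d * a / n) * (math.exp(x) - math.exp(2.0 * x))

def bond_stiffness(r, d, a, r_e, n):
    """Interatomic spring constant K = dF/dr."""
    x = a * (r_e - r)
    return (4.0 * d * a * a / n) * (2.0 * math.exp(2.0 * x) - math.exp(x))

def separation_at_max_force(a, r_e):
    """Separation at which K = 0, i.e. where the force is a maximum."""
    return r_e + math.log(2.0) / a

if __name__ == "__main__":
    d, a, r_e, n = 1.0, 1.5, 2.0, 6          # arbitrary illustrative values
    r_max = separation_at_max_force(a, r_e)
    print(r_max, bond_force(r_max, d, a, r_e, n), bond_stiffness(r_max, d, a, r_e, n))
```

At the separation of maximum force the bracket in the force expression equals 1/4, so the peak force reduces to Da/N.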


In order to utilize these equations to calculate the theoretical
strength of a crystal, it was necessary to obtain numerical values for
a, D, and r_e. Values for D and r_e were found in the literature as
follows:

D = 4.74 eV/atom

r_e = 2.95 Å⁹
The value "a" was determined from the elastic modulus of the crystal.

The structure of titanium is hexagonal close packed, and was considered
to be an assembly of unit tensile cells, as shown in Figure 5. As a
hypothetical stress was applied to the crystal in the [100] direction,
an elastic strain occurred, as shown in Figure 6, so that bonds between
atoms in the plane of atom A such as bond AS were extended, but bonds
such as AT and AJ were not. Also, the bonds between atoms in the
plane of atom A and the atoms above and below this plane were not
extended. For example, bonds AG, AV and AO (Figure 7) were not
extended. Trigonometric calculation demonstrated that this was possible
without physical interference between atoms. As a result, the spring
constant of the unit tensile cell was equal to the spring constant, K, of
the single interatomic bond, AS. A numerical value for K was determined
from Young's modulus, E, as follows:

E = 17.85 x 10^6 psi = 7.67 x 10^-1 eV/Å^3

7.67 x 10^-1 x (6.24 / 2.95) = 16.22 x 10^-1 eV/Å^2 = K

where 6.24 = transverse area of unit tensile cell, Å^2
      2.95 = length of unit tensile cell, Å

From equation (4), above,

$$K = \frac{4}{N}Da^2 = 16.22 \times 10^{-1}$$

and

a = 1.014 1/Å
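The arithmetic of this step can be retraced as follows. This is a sketch under stated assumptions: the text does not give N explicitly, so N = 12 (the nearest-neighbor count of a hexagonal close-packed lattice) is assumed here, and the function names and unit-conversion constants are supplied for illustration rather than taken from the paper.

```python
import math

PSI_TO_PA = 6894.757          # pascals per psi
EV_TO_J = 1.602177e-19        # joules per electron volt
A3_TO_M3 = 1.0e-30            # cubic metres per cubic angstrom

def psi_to_ev_per_cubic_angstrom(psi):
    """Convert a modulus in psi to eV per cubic angstrom."""
    return psi * PSI_TO_PA * A3_TO_M3 / EV_TO_J

def bond_spring_constant(e_modulus, area, length):
    """K = E * (cell area) / (cell length), in eV per square angstrom."""
    return e_modulus * area / length

def morse_constant(k, d, n):
    """Solve K = (4/N) * D * a**2 for a, in 1/angstrom."""
    return math.sqrt(k * n / (4.0 * d))

E_EV = psi_to_ev_per_cubic_angstrom(17.85e6)     # close to the quoted 0.767
K = bond_spring_constant(0.767, 6.24, 2.95)      # uses the rounded value from the text
A = morse_constant(K, 4.74, 12)                  # N = 12 is an assumption

if __name__ == "__main__":
    print(round(E_EV, 3), round(K, 3), round(A, 3))
```

Under these assumptions the result agrees with the quoted a = 1.014 per Å to within rounding.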
A 

Having  numerical  values  for  all  the  constants  required,  the  Morse 
potential  energy  and  interatomic  force  were  calculated  as  a  function  of 
internuclear  displacement,  r,  and  the  results  are  shown  in  Figure  8. 

It will be noted that as a stress was applied and the interatomic bond
extended, the maximum force was reached at the internuclear separation
for which $\frac{dF}{dr} = 0$. The internuclear separation at this point was
calculated from the equation:

$$r = r_e + \frac{.693}{a} \qquad \text{(equation 5, above)}$$

This separation was 3.63 Å, and the corresponding force was 0.402 eV/Å
(0.645 x 10^-4 dynes).
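These two figures follow from equations (2) and (5): at the separation of maximum force the bracket in the force expression equals 1/4, so the peak force is Da/N. The sketch below reproduces them under the assumption N = 12, which is not stated explicitly in the text; the names and the eV/Å to dyne conversion factor are supplied for illustration.

```python
import math

EV_PER_ANGSTROM_TO_DYNE = 1.602177e-4   # 1 eV/angstrom expressed in dynes

def separation_at_max_force(a, r_e):
    """r = r_e + ln(2)/a, the separation where dF/dr = 0."""
    return r_e + math.log(2.0) / a

def max_bond_force(d, a, n):
    """Peak force of one bond, D*a/N, in eV per angstrom."""
    return d * a / n

R_MAX = separation_at_max_force(1.014, 2.95)     # angstroms
F_MAX = max_bond_force(4.74, 1.014, 12)          # N = 12 is an assumption
F_MAX_DYNE = F_MAX * EV_PER_ANGSTROM_TO_DYNE

if __name__ == "__main__":
    print(round(R_MAX, 2), round(F_MAX, 3), F_MAX_DYNE)
```

The values obtained this way agree with the quoted 3.63 Å and 0.402 eV/Å to within the rounding of the inputs.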

This  data  was  then  used  to  calculate  the  theoretical  strength  of  a 
titanium  crystal.  A  tensile  force  was  considered  to  be  applied  in  the 
[001]  direction  in  the  crystal,  an  equal  part  of  which  was  transferred 
to  each  atom  in  the  (001)  plane  (see  Figure  9).  These  tensile  forces  on 
each  atom  are  represented  by  the  vectors  AB  and  AF.  The  tensile 
vector  AB  is  the  resultant  of  the  vectors  AC,  AD,  and  AE.  These  vectors 
arise  from  the  extension  of  the  interatomic  bonds  AC,  AD  and  AE.  As 
the  applied  tensile  force  increased,  the  bonds  extended  until  they  reached 
the  fracture  extension;  and  the  force  in  each  bond  increased  to  equal  the 
fracture force mentioned above. As the applied tensile force increased still
further,  the  bonds  exceeded  the  fracture  separation  and  the  force  in 
each  one  decreased  in  accordance  with  Figure  8.  Thereafter,  the  sum 
of  the  bond  forces  was  less  than  the  applied  force,  so  that  the  applied 
force  was  able  to  fracture  the  crystal  on  the  (001)  plane. 

The theoretical maximum strength was attained immediately
prior to fracture. The stress on the crystal at this time was determined
by calculating the magnitude of the vector AB and dividing it by the area
per atom in the (001) plane. The value so calculated was 3.28 x 10^6 psi.

It  should  be  noted  that,  in  this  theory  of  fracture,  it  was  assumed  that 
no  slip  occurred  and  no  brittle  crack  propagated  through  the  crystal  at 
a  stress  lower  than  the  theoretical  maximum. 

A calculation of the theoretical strength of a monatomic titanium
filament was made by means of equations 1 and 2, letting N = 2, and the
theoretical maximum strength of a monatomic filament was found to be
8,200,000 (8.2 x 10^6) psi, which was considerably higher than for the
crystal. This increased strength was due to greater cohesive energy
per bond, since there were fewer nearest neighbors. The strength of a
monatomic sheet would be above that of the crystal and below that of a
filament, due to the same effect.

Using this method, the theoretical cleavage strength of iron in the
[100] direction was calculated to be 12.7 x 10^6 psi. It is interesting to
compare this value with the observed yield strength of iron whiskers in
the [100] direction, which is .664 x 10^6 psi¹⁰. The strength of the
whiskers is considerably higher than normally observed in iron and
iron-base alloys, showing that the material is capable of much higher strength
than  it  normally  exhibits.  On  the  other  hand,  the  whisker  yield  strength 
is  considerably  less  than  the  calculated  cleavage  strength,  due  to  the  onset 
of  plastic  flow.  If  plastic  flow  could  be  prevented  by  some  means,  it  is 
possible  that  the  high  cleavage  strength  predicted  by  the  calculations 
could  be  attained. 

III. DISCUSSION. The theoretical strength calculated for titanium,
3.28 x 10^6 psi, is much larger than the normally observed strength of
about 200,000 psi. Greater confidence is felt in this method of calculating
theoretical strength than in the method of Polanyi, because of the
questionable aspects of the Polanyi derivation mentioned in the Introduction.

High theoretical strengths may be obtainable in real materials if
the necessary conditions can be maintained, namely, that no slip occur
and no brittle crack propagate at a stress below the theoretical maximum
strength. The method of obtaining these conditions is problematical, and
several possible methods may be suggested. One method might be to
grow whiskers of such perfection that no slip would occur and no brittle
crack would propagate until a stress level approaching the theoretical
were reached. Another method might be to simply produce very fine
filaments (not necessarily whiskers) by some method, such as drawing from
the melt. Slip in these filaments might be inhibited by alloying elements
and brittle crack formation suppressed by the small size. (There is some
evidence that in filaments, the tendency for brittle cracks to nucleate
and propagate is suppressed by decreasing the diameter.) Slip might
also be inhibited by suitable control of crystal orientation, the production
of very fine grain sizes, or by amorphous (vitreous) structures. The
high strengths calculated for monatomic filaments and sheets may be
approached if these or similar structures can be produced. There is
some hope that these high strength levels may be attained in metals such
as titanium, since strength levels of 3 x 10^6 psi have already been
achieved with graphite whiskers¹¹.
IV. RECOMMENDATIONS FOR FURTHER WORK. Since this area
of study offers considerable promise for the development of ultra-high
strength materials, it is suggested that further work be undertaken to
develop the theory of the strength of solids, verify it experimentally,
and find methods of applying it to the production of engineering materials
having these high strengths. From a survey of the literature, it appears
that further developments are necessary in the methods of quantum
mechanics so that more accurate calculations of the energy vs. internuclear
separation may be made. Experiments are required to verify the results
obtained and to provide data for engineering application.

The  materials  offering  the  best  combination  of  properties  should  be 
identified,  and  developmental  programs  initiated  to  establish  methods  of 
providing  high  strength  engineering  materials  at  acceptable  cost  and  in 
the  quantities  required. 

ACKNOWLEDGEMENT. The author is indebted to Dr. R. J. Weiss
of  the  U.  S.  Army  Materials  Research  Agency  for  providing  the  theoretical 
physical  basis  and  guidance  on  which  these  calculations  were  based. 
TABLE I

Bonding Energy and Interatomic Spacing for H2+
(Energy Unit: Rydbergs)

   r      V Exact    V Calculated*

  0.2      7.1426       0.9654
  0.4      2.3984       0.5821
  0.6      0.9903       0.3093
  0.8      0.3910       0.1184
  1.0      0.0964      -0.0122
  1.2     -0.0579      -0.0989
  1.4     -0.1399      -0.1536
  1.6     -0.1819      -0.1854
  1.8     -0.2005      -0.2010
  2.0     -0.2053      -0.2053
  2.2     -0.2017      -0.2020
  2.4     -0.1931      -0.1937
  2.6     -0.1817      -0.1824
  2.8     -0.1687      -0.1693
  3.0     -0.1551      -0.1556
  3.2     -0.1415      -0.1417
  3.4     -0.1282      -0.1282
  3.6     -0.1154      -0.1153
  3.8     -0.1034      -0.1033
  4.0     -0.0922      -0.0922
  5.0     -0.0489      -0.0502
  6.0     -0.0240      -0.0264
  7.0     -0.0102      -0.0136
  8.0     -0.0051      -0.0070
  9.0     -0.0024      -0.0036

*Calculated by means of the Morse potential equation.

Credit: From Quantum Theory of Molecules and Solids,
by J. C. Slater, Copyright (c) 1963 by the
McGraw Hill Book Co., Inc. Used by permission
of McGraw Hill Book Co.
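The "Calculated" column of Table I can be approximately reproduced from the Morse potential equation. The sketch below is illustrative only: the well depth D = 0.2053 Rydbergs and equilibrium separation r_e = 2.0 are read from the table minimum, while the constant a is not given in the text and is inferred here by fitting the single row r = 4.0. The function names are hypothetical.

```python
import math

D_RY = 0.2053    # well depth in Rydbergs, read from the Table I minimum
R_E = 2.0        # separation at the minimum, read from Table I

def fit_a(r, v, d=D_RY, r_e=R_E):
    """Infer the Morse constant a from one table row with r > r_e and v < 0.

    With u = exp(-a*(r - r_e)), the Morse form is v = d*(u**2 - 2*u);
    for r > r_e the root with u < 1 is u = 1 - sqrt(1 + v/d).
    """
    u = 1.0 - math.sqrt(1.0 + v / d)
    return -math.log(u) / (r - r_e)

def morse(r, a, d=D_RY, r_e=R_E):
    """Morse potential V = d*[exp(-2a(r - r_e)) - 2 exp(-a(r - r_e))]."""
    u = math.exp(-a * (r - r_e))
    return d * (u * u - 2.0 * u)

A_FIT = fit_a(4.0, -0.0922)   # assumption: a is fitted, not quoted in the text

if __name__ == "__main__":
    for r in (0.2, 1.0, 3.0, 7.0, 9.0):
        print(r, round(morse(r, A_FIT), 4))
```

With a fitted this way (roughly 0.68), the remaining rows of the calculated column are recovered to within a few units in the fourth decimal place, while the exact values diverge sharply under compression (r well below 2.0), as the text notes.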
TEN SNAKE VENOMS: A STUDY OF THEIR EFFECTS
ON PHYSIOLOGICAL PARAMETERS AND SURVIVAL


James A. Vick, Henry P. Ciuchta, and James H. Manthei
Neurology Branch, Experimental Medicine Department
Medical Research Laboratory, U.S. Army Chemical and Research Labs,
Edgewood Arsenal, Maryland
The poisonous snakes and the venoms they produce have both fascinated
and confounded the scientific world for a number of centuries. In the past,
inaccurate and incomplete descriptions of the physiologic changes observed
following envenomation have aided the advance of folklore concerning the
venomous snake. Even now there are numerous conflicting reports
concerning the mechanism by which these venoms produce their lethality. It
is no small wonder, therefore, that because of these reports many
misgivings and misconceptions concerning the snake and its lethal and/or
incapacitating capabilities have arisen.

With  this  background  in  mind  a  study  was  designed  to  determine  the 
exact  sequence  of  physiologic  changes  which  follow  the  injection  of  a  lethal 
dose  of  snake  venom.  Precise  data  concerning  the  minimum  lethal  dose 
for  each  of  ten  venoms  was  also  determined,  as  well  as  a  comparison  of 
relative  potencies  in  the  mouse  and  dog. 

MATERIALS  AND  METHODS.  The  snake  venoms  used  in  these 
experiments  were  obtained  commercially  from  the  Miami  Serpentarium, 
Miami,  Florida  and  from  the  Medical  Research  Laboratory  at  Ft.  Knox, 
Kentucky.  Each  venom,  which  was  collected  by  inducing  the  snake  to 
strike  a  rubber  covered  jar,  was  mucous  free  and  devoid  of  cellular  debris. 
Bacteria  were  removed  by  high-speed  centrifugation  and  the  supernatant 
liquid  was  lyophilized.  Ten  venoms,  representing  three  families  of  snakes, 
were  studied.  These  were  as  follows: 

Family - Crotalidae

1. Crotalus Adamanteus . . . . . . . . . Eastern Diamondback Rattlesnake
2. Agkistrodon Piscivorus . . . . . . . Cottonmouth Moccasin
3. Crotalus Atrox . . . . . . . . . . . . Western Diamondback Rattlesnake
4. Agkistrodon Contortrix Contortrix . . Southern Copperhead
5. Agkistrodon Contortrix Mokeson . . . Northern Copperhead

Family - Elapidae

1. Bungarus Caeruleus . . . . . . . . . Indian Krait
2. Naja Naja . . . . . . . . . . . . . . Indian Cobra
3. Micrurus Fulvius . . . . . . . . . . Coral Snake

Family - Viperidae

1. Vipera Russelli . . . . . . . . . . . Russell's Viper
2. Bitis Arietans . . . . . . . . . . . Puff Adder


Initial toxicity of the ten venoms was determined using a total of 1864
mice. Just prior to administration, the lyophilized venom was dissolved
in normal saline (1.0 mg/ml) for intravenous injection into the dorsal
tail vein of the mouse. Table 1 shows the number of mice used to
establish the LD50 and time to death for each venom.

Eighty adult mongrel dogs, anesthetized with 30 mg/kg pentobarbital
sodium, were employed in the second phase of this study. Arterial blood
pressure was monitored using a polyethylene catheter inserted into the
femoral artery and connected to a Statham pressure transducer and an
E and M physiograph recorder. Portal venous pressure was recorded
via a catheter inserted into the splenic vein and advanced into the portal
circulation. Respiratory rate, electrocardiogram (EKG), and heart
rate were continuously monitored using a pair of needle-tip electrodes
placed in either side of the chest wall, and connected to the E and M
physiograph through appropriate preamplifiers.

Cortical electrical activity (EEG) was monitored using four unipolar
silver electrodes placed directly on the dura of each hemisphere of the
brain and connected to high-gain preamplifiers. Two of the electrodes were
placed in the frontal area and two in the occipital region of the brain.
Continuous recordings of EEG were made on a Model 5 Grass polygraph.
The LD50, as well as the approximate time to death, was determined
for each venom. All data were statistically evaluated using standardized
procedures (1).

Evisceration was performed in 20 dogs to determine if the initial
fall in blood pressure observed following venom administration was due
to the pooling of blood in the hepato-splanchnic bed. The surgical
evisceration  procedure  was  carried  out  as  follows:  the  celiac,  superior 
and  inferior  mesenteric  arteries  were  ligated,  and  the  stomach,  spleen 
and  intestines  were  removed  after  ligation  of  the  esophagus  and  sigmoid 
colon.  The  portal  vein  was  also  ligated  and  sectioned.  Blood  flow  to  the 
adrenal  glands  and  kidneys  was  carefully  preserved  and  not  impaired  by 
the  procedure. 

Vagotomy was performed through an incision made at the level of
the larynx. Both vagi were cut following a careful dissection from the
tissues surrounding the carotid artery. A recovery period of 60 minutes
was allowed before venom was administered.

RESULTS. The intravenous mouse LD50 with 95 per cent confidence
limits for each of the ten venoms is shown in Table 1. [Figures and tables
have been placed at the end of this article.] Comparative potencies for
each venom are also graphically displayed in Figure 1. It is to be noted
that the most lethal venom (Indian Krait) is approximately one hundred
times more potent than the venoms of either the Northern or Southern
Copperhead. Also shown in Table 1 is the average time to death for
each venom. There appears to be no clear relationship between potency
and survival in that the most potent venom (Krait) has the longest
survival time.

The I.V. LD50 of each venom in the dog is shown in Table 2. Average
time to death is also indicated for the ten venoms. Comparative potencies
for all venoms are presented in Figure 2. In general, these data indicate
that, on a mg/kg basis, the lethal dose of each venom in the dog is
significantly less than the corresponding lethal dose of venom in the
mouse (p < .05). This is not entirely true, however, because the lethal
doses of Russell's Viper and Coral snake venoms are nearly identical
in both the dog and mouse (p > .05). Relative potencies of the venoms are
quite similar in that the venom which is the most potent in the dog is also
the one that is most potent in the mouse. Likewise, the venom which is the
least potent in the dog is also the one that is the least potent in the mouse.

The specific effects of lethal injections of each venom on EEG, EKG,
heart rate, respiration and blood pressure in the dog are shown in
Figures 3-12:

Figure 3. Following a lethal injection of Eastern Diamondback
rattlesnake venom, there occurred a precipitous fall in arterial blood
pressure and a marked narrowing of the pulse pressure. This was
followed at from 8 to 10 minutes by partial recovery of blood pressure
to near normal levels and an increase in pulse pressure. Finally, just
prior to death arterial pressure once again decreased sharply,
terminating with cardiac arrest.


Respiration appeared unaffected during the first 2-5 minutes post
injection, at which time an abrupt cessation in ventilation occurred.

Changes in EKG observed after the injection of the venom were consistent
with progressive cardiac anoxia. Fast tracing during the post injection
period showed depression of the ST segment, inversion of the spike
segment, and finally, overwhelming cardiac hypoxia.

This venom produced marked bradycardia immediately post injection,
becoming progressively severe until just prior to death. At this time
an anoxic tachycardia was observed leading to ventricular fibrillation.

Within 3-5 minutes after injection a complete loss of all cortical
electrical activity occurred. This change was irreversible and appeared
to occur prior to depression of respiratory movements.

Evisceration,  for  the  most  part,  prevented  the  sharp  fall  in  arterial 
blood  pressure  and  the  decrease  in  heart  rate  observed  in  the  intact  dog. 
Instead,  a  very  moderate  decrease  in  blood  pressure  occurred  with  an 
associated  increase  in  heart  rate. 

Bilateral  vagotomy  did  not  prevent  the  drop  in  blood  pressure  but 
did  allow  for  an  increase  in  heart  rate  following  a  lethal  injection  of 
Eastern  Diamondback  rattlesnake  venom. 
Figure 4. The venom of the Cottonmouth moccasin also produced a
precipitous fall in arterial blood pressure; however, an increase in pulse
pressure rather than a decrease was noted. This was followed by partial
recovery at from 3-5 minutes and a subsequent decline in both arterial
and pulse pressures. Just prior to death a second marked increase in
both arterial pressure and pulse pressure occurred. This appeared to
be due to depressed respiratory movements and generalized cardiovascular
hypoxia. Respiration was temporarily interrupted after injection of venom.
This  was  followed  by  partial  recovery  and  a  subsequent  decrease  in  both 
rate  and  volume  over  the  following  10-20  minutes,  leading  to  complete 
apnea.  No  significant  changes  in  EKG  were  noted  until  severe  respiratory 
embarrassment  became  apparent  at  which  time  changes  consistent  with 
generalized  myocardial  hypoxia  appeared.  Likewise,  heart  rate  was 
only  slightly  affected  by  this  venom  until  time  of  apnea  when  terminal 
tachycardia  was  noted.  This  was  followed  by  cardiac  arrest. 

No  significant  changes  in  cortical  electrical  activity  were  noted 
immediately  following  the  injection  of  venom.  Only  after  prolonged 
hypotension  were  alterations  in  EEG  noted.  At  time  of  apnea  complete 
electrical  silence  was  observed. 

Evisceration did not prevent the precipitous decrease in blood
pressure or the bradycardia produced by this venom. In contrast, vagotomy
allowed for an increase in heart rate as the blood pressure fell following
the injection of the venom.

Figure 5. The venom of the Western Diamondback rattlesnake
produced a less dramatic fall in arterial blood pressure. Pulse pressure
increased initially and returned to normal as blood pressure recovered.
No anoxic rise in blood pressure was observed at any time prior to death.
Rather, a slow progressive decline in both arterial and pulse pressures
occurred during the 10-15 minutes preceding cardiac arrest. Respiration
was not significantly affected by the venom during the first 10 minutes
post injection; however, an abrupt decrease in both respiratory rate and
volume was noted at approximately 15-20 minutes, which quickly led to
complete cessation of respiration.

With this venom the EKG was relatively normal until the time at
which both apnea and severe hypotension became prominent. When this
occurred changes consistent with cardiac hypoxia were noted. The only
alteration in heart rate noted following injection of venom was a terminal
bradycardia which occurred at time of cardiovascular collapse.

A decrease in cortical electrical activity was observed following
Western Diamondback rattlesnake venom and occurred prior to any
significant alterations in normal physiologic function. This change in cortical
activity progressed to complete electrical silence just prior to death.

Evisceration  partially  prevented  the  decrease  in  arterial  blood 
pressure  observed  with  this  venom  and  allowed  for  an  increase  in  heart 
rate. 

Vagotomy also eliminated the post venom bradycardia but did not
prevent the sharp fall in blood pressure.

Figure  6.  A  lethal  injection  of  Northern  Copperhead  venom  produced 
an  unusually  sharp  fall  in  arterial  blood  pressure  and  a  remarkable 
increase  in  pulse  pressure.  Arterial  pressure  remained  at  a  very  low 
level  (30-40  mm  Hg)  until  respiratory  arrest  occurred  at  which  time  an 
anoxic-induced hypertension and subsequent cardiovascular failure
occurred.  This  entire  sequence  of  events  required  a  total  of  from  8-12 
minutes.  Complete  respiratory  arrest  occurred  approximately  2-1/2 
minutes  after  the  injection  of  the  venom.  Changes  in  EKG  and  heart 
rate  were  observed  only  after  prolonged  apnea.  This  also  is  true  for 
the  change  in  cortical  electrical  activity.  Loss  of  EEG  appeared  due 
primarily  to  prolonged  cerebral  hypoxia.  Average  time  to  death  with 
this  venom  was  approximately  2  hours. 

Evisceration  did  not  significantly  alter  the  changes  in  heart  rate 
and  blood  pressure  observed  in  the  intact  dog. 

Vagotomy  did  allow,  however,  for  an  increase  in  heart  rate  as 
arterial  pressure  fell  following  injection  of  venom. 

Figure  7.  Southern  Copperhead  snake  venom  produced  changes  in 
the dog similar to those observed with the venom of the Northern Copperhead.
A precipitous fall in arterial blood pressure occurred with an
associated  increase  in  pulse  pressure.  At  5-10  minutes  post  injection 
pulse  pressure  narrowed  as  arterial  pressure  increased  slightly.  No 
significant  changes  in  respiration,  EKG,  heart  rate  or  EEG  were  noted 
during  the  initial  post  injection  period.  Progressive  respiratory 
depression was noted at from 30-60 minutes, terminating in apnea and a
subsequent  cardiovascular  collapse.  With  this  venom  a  slow  progressive 
decline  in  cortical  electrical  activity  was  observed  which  occurred  prior 
to any significant change in respiration. Time to death was approximately
1 to 1-1/2 hours.

The  effect  of  evisceration  and  vagotomy  was  identical  to  that  observed 
with  Northern  Copperhead. 

Figure  8.  The  venom  of  the  Indian  Krait  produced  a  gradual  decrease 
in  arterial  blood  pressure  with  little  or  no  change  in  pulse  pressure. 
Arterial pressure returned to normal at from 5-15 minutes and remained
stable until the final anoxic rise and abrupt decline at death. Respiration
remained unaffected by the venom until approximately 20-30 minutes post
injection, at which time a decrease in amplitude but not rate of respiration
was observed. No significant changes in heart rate or EKG were observed
at any time prior to cessation of respiration. Cortical electrical activity
also remained essentially normal following Indian Krait venom, decreasing
abruptly only after prolonged apnea and following the onset of
cardiovascular difficulties. Average time to death for this group was
2 hours.

Evisceration  partially  prevented  the  sharp  fall  in  arterial  pressure 
but  did  not  affect  the  profound  bradycardia  observed  in  the  intact  dog. 

Following vagotomy no significant decrease in arterial blood pressure
was noted. The bradycardia previously noted in the intact and eviscerated
animals was eliminated by vagotomy, being replaced by an actual increase
in heart rate.

Figure  9.  A  lethal  injection  of  Indian  Cobra  venom  produced  an 
immediate  fall  in  arterial  blood  pressure  and  a  narrowing  of  the  pulse 
pressure.  This  was  followed  by  a  progressive  increase  in  both  arterial 
and  pulse  pressure  to  near  normal  levels  reaching  maximum  recovery 
at  from  20-25  minutes.  With  the  onset  of  respiratory  paralysis  a  sharp 
rise  in  both  pressures  was  noted  which  terminated  in  cardiovascular 
collapse and death. The effect of cobra venom on the respiratory
mechanism of the dog has previously been described in great detail.
This study confirmed previous results in that there was a slow progressive
decrease in respiratory rate and volume with complete arrest at
approximately 20-30 minutes post injection. Heart rate and EKG were not
markedly affected by the venom until respiratory arrest, at which time
terminal anoxic changes were observed. A remarkable change in cortical
electrical activity was noted following administration of the cobra venom.
Within 30-60 seconds there was a complete and irreversible loss of all
cortical electrical activity resulting in an isoelectric EEG tracing.

The initial fall in arterial pressure and decrease in heart rate were
completely prevented by surgical evisceration. Instead, a marked
increase in heart rate followed the administration of the venom and
occurred as blood pressure fell slowly over the entire observation period
of 1 to 2 hours.

Vagotomy  had  no  significant  effect  upon  the  changes  in  heart  rate  and 
blood  pressure  previously  noted  in  the  intact  dog. 

Figure  10.  The  venom  of  the  Coral  snake  produced  an  initial  rise 
in  arterial  blood  pressure.  This  was  followed  in  30  to  60  seconds  by  a 
sharp  fall  in  arterial  pressure  and  a  decrease  in  pulse  pressure.  Both 
pressures  then  gradually  increased  reaching  normal  or  near  normal 
levels  in  15-30  minutes  post  injection.  At  time  of  severe  respiratory 
embarrassment  arterial  pressure  fell  off  abruptly.  The  hypoxic  rise  in 
systemic  pressure  previously  noted  with  other  venoms  at  time  of  apnea 
was  not  seen.  Immediately  after  venom  administration  a  temporary 
period  of  apnea  was  also  observed  which  lasted  from  3  to  5  minutes. 
Breathing  gradually  returned  to  normal  and  remained  such  until  time  of 
respiratory  failure.  Heart  rate  decreased  abruptly  during  the  time  of 
initial  hypotension.  Heart  rate  returned  to  normal  in  approximately  10 
minutes and remained stable until terminal bradycardia was observed.
EKG was not affected by the venom until time of respiratory arrest. A
gradual decrease in cortical electrical activity was noted at from 3 to 5
minutes post injection. This change was reversible and EEG returned to
normal or near normal 15 minutes post injection. A second loss of
cortical activity was noted at the terminal stage at a time when severe
respiratory difficulties were apparent. Average time to death with
coral snake venom was 2.5 hours.

With  this  venom  an  increase  in  heart  rate  and  a  decrease  in  arterial 
blood  pressure  were  noted  in  both  the  eviscerated  and  the  vagotomized 
animals. 


Figure 11. A lethal injection of Russell's Viper venom produced an
immediate and irreversible decline in arterial blood pressure. Pulse
pressure decreased as arterial pressure fell and remained narrow until
death. No terminal signs of hypoxia were exhibited with this venom.
Respiration was not affected during the initial post injection period.
However, at approximately ten minutes there was an abrupt cessation of
respiratory movements. Heart rate decreased as arterial blood pressure
fell, showing some increase in rate just prior to death. Following
respiratory arrest, however, profound bradycardia was noted.

Progressive  hypoxic  changes  in  EKG  were  noted  after  administration 
of  the  venom.  At  time  of  death  electrical  disassociation  leading  to 
cardiac  arrest  was  seen.  No  alteration  in  electrical  cortical  activity 
was noted immediately post injection. Following the prolonged hypotension
a gradual decrease in activity was observed. At no time prior
to death, however, was a completely isoelectric tracing (EEG quiescence)
recorded such as was observed with certain other venoms. Evisceration
prevented the initial hypotension and bradycardia produced by Russell's
Viper venom. A rather slow progressive decline in arterial blood pressure
occurred over a 15-30 minute period of time. Death followed
respiratory paralysis. Vagotomy did not prevent the sharp fall in arterial
blood pressure previously noted in the intact animal; however, bradycardia
was prevented and a significant increase in heart rate occurred.

Figure 12. The venom of the Puff Adder produced a somewhat
transient fall in arterial blood pressure. Following the brief fall both
blood pressure and heart rate decreased progressively over the 15-30
minutes preceding death. An abrupt cessation of breathing was also
noted with this venom. Sporadic irregular movements were observed at
approximately 15 minutes post injection. This was followed by complete
cessation of respiratory movements. Profound bradycardia and EKG
changes were noted shortly after envenomation, progressing rapidly to
cardiac arrest. Cortical electrical activity decreased sharply at
approximately 3 to 5 minutes, remaining "quiet" until death. Evisceration
did not prevent the initial fall in arterial blood pressure but did
eliminate the sharp decrease in heart rate produced by the venom of
the Puff Adder. Eviscerated animals went on to expire, however, in
much the same manner as the intact envenomated dogs. Vagotomy
eliminated the bradycardia, allowing for an increase in heart rate, but
did not prevent the initial fall in blood pressure.

DISCUSSION. The results of this study indicated that the toxicity of
snake venom is not a species specific phenomenon. Even though the
lethal dose of venom for the mouse is in many instances 5 to 10 times
greater than that for the dog, relative potencies are remarkably similar.
That is, venoms which appear most toxic in the mouse are likewise
most potent in the dog; the reverse of this is also true. As the potency
of the venom decreases, however, the difference between the lethal dose,
on a mg/kg basis, for the mouse and dog increases. This is most probably
due to differences in the rate of metabolism for each species which
may be obscured in the extremely potent venoms. Our data would tend
to substantiate this in that the mouse and dog LD50's are quite alike for
two of the more potent venoms, i.e., Russell's Viper and Coral snake.

The injection of a lethal dose of snake venom produced a precipitous
fall in arterial blood pressure and a marked decrease in heart rate.
This is not unlike those changes observed following administration of
certain other toxins where hypotension, bradycardia and decreased
venous return have been observed and are attributed to the hepatosplanchnic
pooling of blood (2, 3, 4). In this study surgical removal of the viscera
prior to envenomation was seen to prevent the initial fall in blood pressure
and apparent pooling of blood in dogs administered either cobra or
Russell's Viper venoms. These data support the concept that these
venoms produce a marked pooling of blood in the hepatosplanchnic bed
of the dog. Evisceration, however, did not prevent death of the animals.
With Rattlesnake and Krait venoms evisceration modified but did not
prevent the initial fall in arterial blood pressure. This is most probably
due to pooling of blood in the pulmonary tissues as well as in the
hepatosplanchnic bed (5, 6). Pulmonary vascular pooling per se is also thought
to occur with the venoms of the puff adder, coral snake, copperhead and
cottonmouth moccasin. In these cases evisceration did not in any way
modify the initial drop in blood pressure previously observed in the intact
dog. Studies are currently underway in these laboratories to more
closely examine this phenomenon.

A cholinergic-like response has been described following the injection
of gram negative endotoxin in which a decrease in heart rate was
noted and appeared to be due to an increase in parasympathetic tone (7).
Lethal doses of venom also produced bradycardia in conjunction with the
early fall in blood pressure. Bilateral vagotomy prior to administration
of venom not only eliminated the slowing of the heart but actually allowed
for a significant increase in rate. Vagotomy, in contrast to evisceration,
did not, however, prevent the initial fall in blood pressure, nor did it
in any way alter the ultimate lethal effects of envenomation. The only
exception in this study of venoms was found with that of the Indian Krait,
which, if administered following vagotomy, did not result in a
decrease in either heart rate or blood pressure. All animals treated in this
manner did eventually expire, however.

The  effect  of  certain  venoms  on  cortical  electrical  activity  has 
previously  been  described  (8,9).  This  study  confirmed  the  earlier  reports 
in  that  a  marked  change  in  EEG  was  observed  following  the  intravenous 
administration  of  crude  cobra  venom.  This  observation  has  been  extended 
to include the venoms of the Eastern and Western Diamondback Rattlesnakes
and the Puff Adder. No significant changes in EEG appeared to
occur  with  the  remaining  venoms.  The  mechanism  by  which  the  venoms 
produced  a  quieting  of  cortical  activity  is  as  yet  obscure. 

The most nebulous aspect of this study was the apparent mode of
death by which the venoms produced their lethal effects. For the most
part the primary mechanism of death appeared to be of a respiratory
nature. It is important to note, however, that the respiratory failure
observed with certain venoms followed a prolonged period of hypotension.
The apparent cause of respiratory failure may not in fact be due to the
direct action of venom on the respiratory system but to a medullary
hypoxia. Nonetheless it has been proposed by some that cobra venom
produces respiratory paralysis by interference with nerve impulse
transmission at the myoneural junction of the diaphragm (10, 11). Others
postulate that this phenomenon may be the result of increased nerve
membrane permeability (12). Although other venoms may act in much
the same manner as cobra venom, preliminary observations would
indicate that central respiratory involvement is indeed a possibility.
Halmagyi et al. have shown that rattlesnake venom decreases sensitivity
of medullary respiratory neurons rather than affecting either the peripheral
nerve or neuromuscular apparatus (13). These possibilities have not
yet been explored.

SUMMARY. Lethal doses of venom representing three families
of poisonous snakes (Crotalidae, Elapidae and Viperidae) were
administered intravenously to mice and dogs. The approximate lethal dose
of ten venoms was established, as well as a characterization of the
pathophysiological events preceding death in the anesthetized dog. Results
indicate:

1.  On  a  mg/kg  basis  the  lethal  dose  of  each  venom  for  the  dog 
is  significantly  less  than  that  for  the  mouse. 

2. The venom which is most potent in the dog is also the one that
is most potent in the mouse. Likewise, the venom which is
the least potent in the dog is also the one that is least potent
in the mouse.

3.  All  venoms  produced  a  precipitous  fall  in  arterial  blood 
pressure  immediately  post  injection  which  appeared  to  be 
due  to  pooling  of  blood  in  the  viscera  and/or  the  pulmonary 
vasculature. 

4.  A  significant  decrease  in  heart  rate  occurred  simultaneously 
with  the  drop  in  arterial  blood  pressure  and  can  be  completely 
eliminated  by  prior  vagotomy. 

5. The venoms of the Indian Cobra, Rattlesnake and the Puff
Adder all produced a marked decrease in cortical activity
immediately following injection.

6. The mode of death with these venoms appeared to be
respiratory in nature, although the role of prolonged
cardiovascular hypotension has not yet been fully evaluated.

Table: Listing of Snake Venoms Arranged According to Order of Decreasing Toxicity.

Figures 7, 11, and 12.


PIRICULARIA  ORYZAE  -  RELATIONSHIP  BETWEEN 
LESION  COUNTS  AND  SPORE  COUNTS 

Thomas H. Barksdale, William D. Brener,
Walter  D.  Foster,  and  Marian  W.  Jones 
U.  S.  Army  Biological  Laboratories 
Fort  Detrick,  Frederick,  Maryland 


INTRODUCTION: On theoretical grounds, one spore of Piricularia
oryzae can cause one lesion on a rice leaf under suitable environmental
conditions. In nature, however, when a plant population (that has leaves
oriented in all possible planes) is exposed to a population of spores, the
ratio of spores to lesions is necessarily much greater than 1:1. This is
true for a variety of reasons, e.g., (a) not all spores are viable,
(b) not all spores land on leaves because most fall on soil or are carried
away from fields by air currents, and (c) not all leaves are equally
susceptible. It was desired to simulate natural conditions and to find
the relationship between spore and plant populations in terms of a sample
of a spore cloud and lesion counts, respectively. Of particular interest
was an estimate of the range of spore counts below which lesions are
not likely to form, and above which lesions will usually appear.

MATERIALS AND METHODS: Weighed amounts of a dry spore
preparation (1) of Piricularia oryzae, Race 1, were discharged with a
CO2 pistol into a small closed chamber (30 x 18 inches x 26 inches high)
placed flush against an ordinary chemical fume hood with a floor surface
35 inches across and 28 inches deep (Figure 1).

After  one  minute  was  allowed  for  the  cloud  to  equalize,  the  front 
and  rear  sides  of  the  chamber  were  quickly  removed,  and  the  cloud 
was drawn through the hood. Pots of one-month-old Gulfrose rice
plants were arranged in the hood to the front-left, front-right, rear-
left,  and  rear-right  of  hood  center  where  a  rotobar  spore  sampler  was 
located  at  plant  height.  Spores  collected  on  the  rotobar  were  counted 
after  each  run. 

Four runs (designated A1, A2, B1, and B2) per day were made on
each of five successive days. The following amounts of inoculum were
used for runs designated "A": 2, 4, 8, 16, and 32 mg; for runs "B":
1, 5, 10, 25, and 50 mg were used. Following inoculations on a given
day, plants were placed in dew chambers at 72 to 75°F for 16 hours,
after which they were placed on a greenhouse bench. Eight days later,
data for each pot were taken in terms of (a) number of lesions, and
(b) number of leaves.

ANALYSIS: Variables for analysis were "lesions per leaf" and
"number of spores on rotobar". We had hoped to find transformations of
spore counts and/or lesion counts that would linearize the relationship
between the two variates, with the X-intercepts of tolerance limits for the
regression line providing the desired range of spore counts, as shown
in Figure 2.

Some  of  the  mathematical  models  investigated  are  shown  as  follows: 


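Number   Equation

1        $\log[\log(\text{Lesions} + a_1) + a_2] = \alpha + \beta\,(\log \text{Spores})$

2        $\log[\log(\text{Lesions} + a_1) + a_2] = \alpha + \beta\,(\text{Spores})$

3        $\log[\log(\text{Lesions} - \text{Background} + a_1) + a_2] = \alpha + \beta\,(\log \text{Spores})$

4        $\log(\text{Lesions} - \text{Background} + a) = \alpha + \beta\,(\text{Spores})$

5        $(\text{Lesions})^{p} = \alpha + \beta\,(\text{Spores})$, exponent $p$ illegible in the source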


Preliminary  tests  based  on  a  more  limited  range  of  spore  counts 
indicated  that  Equation  1,  which  is  a  special  case  of  the  Weibull  function, 
linearized  the  data  for  each  individual  test;  however,  parameters  varied 
among  tests.  At  that  time  the  variation  was  attributed  to  non-standard 
experimental  variables.  For  the  tests  discussed  in  this  paper,  particular 
emphasis  was  placed  on  standardization  of  experimental  variables  such 
as method of firing the CO2 pistol, time elapsed between steps in the
procedure,  and  plant  age.  Also,  an  extended  range  of  spore  counts  was 
used.  Equation  1  did  not  linearize  the  data  obtained  from  these  tests. 

Equation  2  differs  from  Equation  1  in  that  original  spore  counts 
were not transformed. This equation resulted in linearity, but variation
in the transformation of lesions increased with number of spores on
rotobar,  and  a  positive  Y-intercept  was  obtained.  When  this  equation 
was  fitted  to  the  data,  approximately  52%  of  the  variation  in  "lesions  per 
leaf"  was  explained. 

The positive Y-intercept of Equation 2 indicates that some lesions
would have been formed in the absence of spores. This is not possible.
The data could have resulted from use of previously infected plants, or
from contamination of the chamber and/or hood from previous runs. It
was  assumed  that  some  background  was  present,  and  the  average  number 
of  lesions  obtained  with  very  low  spore  counts  was  used  as  an  estimate 
of  this  background,  shown  in  Equation  3.  A  linear  relationship  in  this 
transformation  did  not  exist. 

A  plot  of  the  data  transformed  as  in  Equation  4  gave  results  similar 
to those obtained with Equation 2; i.e., the function appeared to be
linear, but with unequal variances in the transformation of lesions, and
it appeared that a positive Y-intercept would still exist. Results from
Equations  2  and  4  did,  however,  seem  to  imply  that  spores  should 
remain  untransformed. 

Equation  5  gave  the  desired  properties  of  linearity  and  homogeneity 
of  variances.  When  this  equation  was  fitted  to  the  data,  results  shown 
in  Figure  3  were  obtained.  This  equation  explained  about  65%  of  the 
variation  in  "lesions  per  leaf".  A  positive  Y-intercept  is  again  evident. 
Untransformed  data,  together  with  the  fitted  equation  and  80%  tolerance 
limits  for  individual  values  are  shown  in  Figure  4. 
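
The kind of fit described above can be sketched as follows. This is a minimal illustration only, not the authors' computation: the data are hypothetical, the helper name fit_line is invented here, and the exponent of Equation 5 is not legible in the source, so a square root (power = 0.5) is assumed purely for the example.

```python
def fit_line(x, y):
    """Ordinary least squares for y = alpha + beta * x.

    Returns (alpha, beta, r_squared), where r_squared is the fraction
    of the variation in y explained by the fitted line.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    ss_res = sum((yi - (alpha + beta * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return alpha, beta, 1.0 - ss_res / ss_tot


# Hypothetical illustration data, NOT the measurements from this study.
spores = [5.0, 10.0, 20.0, 40.0, 80.0, 160.0]
lesions_per_leaf = [0.3, 0.5, 0.8, 1.6, 2.9, 5.2]

power = 0.5  # assumed exponent; the source value is illegible
transformed = [v ** power for v in lesions_per_leaf]

alpha, beta, r_squared = fit_line(spores, transformed)
x_intercept = -alpha / beta  # spore count at which the fitted line crosses zero
print(f"alpha={alpha:.3f} beta={beta:.5f} R^2={r_squared:.2f} x0={x_intercept:.1f}")
```

With a positive Y-intercept (alpha > 0) and a positive slope, the X-intercept computed this way is negative, which is the difficulty the authors report.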

DISCUSSIONS  AND  PROBLEMS:  Our  problem  is,  of  course,  that 
we  did  not  obtain  the  expected  positive  X-intercept  from  which  tolerance 
limits for individual values would have given an estimate of the range
of  number  of  spores  below  which  lesions  would  not  form  and  above  which 
lesions  would  usually  appear. 

Some  deficiencies  of  the  experimental  design  have  occurred  to  us. 

First,  we  should  have  included  runs  in  which  no  spores  were 
released  as  a  check  on  methods  and  a  measure  of  any  background  that 
may have been present.

Second,  consideration  should  have  been  given  to  the  ratio  of  leaf 
area (this involves orientation of the leaves among other factors) to
volume  of  air  sampled  by  the  rotobar.  An  attempt  should  have  been 
made  to  equalize  the  probability  of  obtaining  one  spore  on  the  rotobar 
and  the  probability  of  one  spore  landing  on  any  one  leaf  in  the  hood. 

Detection of small numbers of spores presents a technical problem.
Since  our  interest  is  in  a  range  of  counts  that  is  probably  low,  perhaps 
it  might  be  more  accurate  to  estimate  lesion  counts  expected  from  low 
spore  counts  by  extrapolation  of  a  function  derived  from  a  range  of  higher 
counts,  in  which  we  have  more  confidence. 

We have been measuring number of spores collected at plant
height. Perhaps this is not the measurement we need. Fallout would not
be included in this measurement. Perhaps an additional measurement of
spores collected from the floor of the hood should be made. We may need
a measurement of the cloud before it reaches the plants, in which case
should we go to a wind tunnel?

Suggestions  for  the  design  and  analysis  of  an  experiment  to  find  the 
relationship  between  spore  counts  and  lesion  counts,  particularly  the 
range  of  spore  counts  below  which  lesions  will  not  be  likely  to  form  and 
above  which  lesions  will  usually  appear,  will  be  appreciated. 


REFERENCES 


1. Andersen, A. L., B. W. Henry, and E. C. Tullis, 1947. Factors
affecting infectivity, spread, and persistence of Piricularia
oryzae Cav. Phytopathology 37:94-110.


Figure 1: Inoculation Equipment. Showing location of plants and rotobar spore sampler in the
fume hood, and the cloud chamber with removable sides in front
of the hood.

Figure 2. Horizontal axis: Transform of Spore Counts. The two marked points are the desired
values on the spore axis: the value below which lesions will not be likely to form, and the value
above which lesions will usually appear.

(Figure axis label: Number of Spores on Rotobar)

EXTREME VERTICES DESIGN OF MIXTURE EXPERIMENTS*


R.  A.  McLean,  University  of  Tennessee 
and 

V.  L.  Anderson,  Purdue  University 


ABSTRACT. The extreme vertices design is developed as a procedure
for conducting experiments with mixtures when several factors
have constraints placed on them. The constraints so imposed reduce the
size of the factor space which would result had the factor levels been
restricted to only 0 to 100 per cent. The selection of the vertices
and the various centroids of the resulting hyper-polyhedron as the design
is a method of determining a unique set of treatment combinations. This
selection is motivated by the desire to explore the extremes as well as
the center of the factor space.

A non-linear programming procedure is used in determining the
optimum  treatment  combination. 

INTRODUCTION. In experiments dealing with mixtures one studies
the response surface of a given dependent variable, y (e.g., amount
of illumination, in candles, for a given size flare), as a function of q
factors ($q \ge 3$). The q factors (components) are all represented by a
proportion, $x_i$, of the total mixture. Thus

{1}    $\sum_{i=1}^{q} x_i = 1$  and  $0 \le a_i \le x_i \le b_i \le 1$

where $i = 1, \ldots, q$, and the $a_i$ and $b_i$ are constraints on the $x_i$
imposed by the experimenter or by the physical situation involved.
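
As a small illustration of the conditions in {1}, the check below tests whether a candidate treatment combination is an admissible mixture. It is a sketch under stated assumptions: the function name is_feasible, the tolerance, and the example bounds are hypothetical and do not come from the paper.

```python
def is_feasible(x, lower, upper, tol=1e-9):
    """Return True if x satisfies the mixture conditions {1}.

    x, lower, upper are equal-length sequences holding the proportions
    x_i and the constraints a_i and b_i. The proportions must sum to 1
    and each must lie between its lower and upper constraint.
    """
    if not (len(x) == len(lower) == len(upper)):
        raise ValueError("x, lower and upper must have the same length")
    sums_to_one = abs(sum(x) - 1.0) <= tol
    within_bounds = all(a - tol <= xi <= b + tol
                        for xi, a, b in zip(x, lower, upper))
    return sums_to_one and within_bounds


# Hypothetical three-component constraints (not taken from the paper).
lower = [0.1, 0.2, 0.0]
upper = [0.6, 0.7, 0.5]

print(is_feasible([0.3, 0.4, 0.3], lower, upper))  # inside all constraints
print(is_feasible([0.7, 0.2, 0.1], lower, upper))  # first component exceeds b_1
```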

Scheffé [6] introduced the topic of mixture experiments for the
case $a_i = 0$ and $b_i = 1$ for $i = 1, \ldots, q$. He defined the {q, m}
simplex lattice design as a design which uniformly covers the factor
space with each factor having m+1 equally spaced values from 0 to 1
such that $\sum x_i = 1$. A complete {3, 2} lattice would consist of
observations taken at the following points: (1,0,0), (0,1,0), (0,0,1),
(0.5,0.5,0), (0.5,0,0.5), and (0,0.5,0.5), which are seen to lie on
the plane $x_1 + x_2 + x_3 = 1$ in the first octant (Mason and Hazard [3]).

*This paper has been submitted to Technometrics for publication.

This lattice is pictured in Figure 1 and redrawn in Figure 2 as a
two-dimensional simplex. Since the example which is presented later
contains four factors, a {4, 3} lattice is presented in Figure 3 as a
three-dimensional simplex (tetrahedron). In general the q dimensional
factor space will reduce to a (q-1) dimensional polyhedron.

Figure 3 - {4, 3} lattice. Labeled vertices include (1,0,0,0), (0,1,0,0), and (0,0,1,0).
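
The {q, m} simplex lattice can be enumerated directly from its definition. The sketch below is illustrative only; the function name simplex_lattice is an assumption of this example, and exact fractions are used so that the coordinates sum to exactly 1.

```python
from fractions import Fraction
from itertools import product


def simplex_lattice(q, m):
    """Enumerate the points of a {q, m} simplex lattice.

    Each of the q coordinates takes one of the m+1 equally spaced values
    0, 1/m, ..., 1, and only the combinations summing to 1 are kept.
    """
    points = []
    for counts in product(range(m + 1), repeat=q):
        if sum(counts) == m:  # the coordinates counts/m then sum to 1
            points.append(tuple(Fraction(c, m) for c in counts))
    return points


for point in simplex_lattice(3, 2):
    print(tuple(float(c) for c in point))
```

For q = 3 and m = 2 this yields the six points listed in the text; for the {4, 3} lattice it yields 20 points.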

Scheffé discusses the use of a polynomial of degree n in estimating
the response function defined on any {q, n} lattice design. A simple
procedure is derived for estimating the regression coefficients for these
polynomials in the case of {q, 2} and {q, 3} lattice designs. This
method was extended by Gorman and Hinman [2] to the case of a {q, 4}
lattice design and the corresponding quartic polynomials. Both of these
papers  give  detailed  information  on  testing  the  fit  of  the  polynomial  to 
the  response  surface  and  for  determining  the  variance  of  a  predicted 
response. 

Scheffé briefly discusses the problem where one factor has an
upper bound less than one, thus the restriction $x_i \le b_i < 1$ for some
i. The notion of a "pseudocomponent" (coding of the original variables)
is introduced which permits the establishment of a regression equation
in terms of the coded variables. It is pointed out that this procedure
can be extended to more than one factor. It is also noted that the
design of the experiment for this method has a shortcoming of not
completely covering the factor space of interest.

It is the purpose of this paper to introduce a design which will
allow each factor to be constrained as described in {1} and cover the
extremes of the factor space. It is assumed throughout that the
degenerate situation of

$\sum_{i=1}^{q} a_i \ge 1$  or  $\sum_{i=1}^{q} b_i \le 1$

does not occur. In the case of either equality only one treatment
combination would be feasible, i.e., either $(a_1, \ldots, a_q)$ or $(b_1, \ldots, b_q)$,
respectively. In the case where $a_i = b_i$ for the ith factor, the
dimensionality of the factor space is reduced by one and the remaining
components must sum to $(1 - a_i)$, which indicates that the design problem
is essentially the same. Hence we also assume that $a_i \ne b_i$ for any
$i = 1, \ldots, q$.
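
The assumptions just stated can be checked mechanically before a design is built. The following is a minimal sketch; the function name constraint_status, its return labels, and the example bounds are hypothetical. The "empty" label for a strict inequality is an inference from the sum-to-one condition rather than a statement in the paper.

```python
def constraint_status(lower, upper, tol=1e-9):
    """Classify a set of constraints a_i (lower) and b_i (upper).

    Returns one of:
      "empty"             sum(a_i) > 1 or sum(b_i) < 1, no mixture can sum to 1
      "single point"      sum(a_i) = 1 or sum(b_i) = 1, one feasible combination
      "reduced dimension" some a_i = b_i, that factor is fixed
      "ok"                none of the degenerate situations occurs
    """
    sum_a = sum(lower)
    sum_b = sum(upper)
    if sum_a > 1.0 + tol or sum_b < 1.0 - tol:
        return "empty"
    if abs(sum_a - 1.0) <= tol or abs(sum_b - 1.0) <= tol:
        return "single point"
    if any(abs(a - b) <= tol for a, b in zip(lower, upper)):
        return "reduced dimension"
    return "ok"


# Hypothetical constraints for illustration only.
print(constraint_status([0.1, 0.2, 0.0], [0.6, 0.7, 0.5]))
print(constraint_status([0.5, 0.5, 0.0], [0.6, 0.7, 0.5]))
```
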

EXTREME VERTICES DESIGN. The constraints placed on the
individual factors describe an irregular hyper-polyhedron (q-1
dimensions). The vertices and centroids of this figure describe a unique set
of points (the design of the experiment) which may be used to estimate
the response surface. Throughout the paper it will be assumed that
there is a sufficient number and adequate placement of points in the
design to permit estimation of all parameters in the polynomial that is
used to approximate the response surface. In the case of the quadratic
model

{2}    $y = \sum_{i=1}^{q} \beta_i x_i + \sum_{1 \le i < j \le q} \beta_{ij} x_i x_j$

which is used exclusively in this paper, a minimum of $\tfrac{1}{2} q(q+1)$ points
are required. Additional points would, of course, be necessary if an
estimate of error is needed or if the lack of fit is to be tested. In case
additional points are desired in any given design they may be obtained,
for example, by using midpoints of the edges of the hyper-polyhedron
or repeating some of the existing points. A more elaborate description
of the various polynomials that may be used can be found in the Gorman
and Hinman paper.

Once the constraints for each factor are given all the points of the basic design are uniquely determined. The vertices of the design must fall on the boundary formed by upper or lower constraints of (q-1) factors. Hence, the vertices of the design may be found by using the two following rules:

(1) Write down all possible two level treatment combinations using the $a_i$ and $b_i$ levels for all but one factor which is left blank, e.g. $(a_1, b_2, a_3, -, a_5, b_6)$ for a six factor experiment. This procedure generates $q \cdot 2^{q-1}$ possible treatment combinations with one factor's level blank in each.

(2) Go through all $q \cdot 2^{q-1}$ possible treatment combinations and fill in those blanks that are admissible, i.e., that level (necessarily falling within the constraints of the missing factor) which will make the total of the levels for that treatment combination add to one. Each of the admissible treatment combinations is a vertex; however, some vertices may appear more than once.
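
Rules (1) and (2) lend themselves to a short program. The following Python sketch is one way to carry them out; the function name `extreme_vertices`, the numerical tolerance, and the rounding used to merge repeated vertices are choices made here for illustration and are not part of the original procedure.

```python
from itertools import product


def extreme_vertices(lower, upper, tol=1e-9):
    """Enumerate extreme vertices for a mixture with a_i <= x_i <= b_i, sum x_i = 1.

    Rule (1): fix all but one factor at its lower or upper level.
    Rule (2): fill the blank so the levels add to one and keep the point
    only if the filled level lies within that factor's constraints.
    """
    q = len(lower)
    found = []
    for blank in range(q):
        others = [i for i in range(q) if i != blank]
        for levels in product(*[(lower[i], upper[i]) for i in others]):
            fill = 1.0 - sum(levels)
            if lower[blank] - tol <= fill <= upper[blank] + tol:
                point = [0.0] * q
                for i, value in zip(others, levels):
                    point[i] = value
                point[blank] = fill
                # Rounding merges vertices that are generated more than once.
                key = tuple(round(v, 9) for v in point)
                if key not in found:
                    found.append(key)
    return found


# Four-factor case with bounds (.40,.60), (.10,.50), (.10,.50), (.03,.08):
# 4 * 2**3 = 32 candidate combinations are examined.
flare = extreme_vertices([0.40, 0.10, 0.10, 0.03], [0.60, 0.50, 0.50, 0.08])
print(len(flare))  # 8
```

The loop visits exactly $q \cdot 2^{q-1}$ candidates, so the cost grows quickly with $q$; for the small $q$ discussed in this paper that is not a concern.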

The hyper-polyhedron so constructed contains a variety of centroids. There is one located in each bounding 2-dimensional face, 3-dimensional face, ..., r-dimensional face ($r \le q-2$), and there is the centroid of the hyper-polyhedron itself. The latter point is the treatment combination obtained by averaging all the factor levels of the existing vertices. The centroids of the 2-dimensional faces are found by isolating all sets of vertices for which each of (q-3) factor levels remains constant within a given set and by averaging the factor levels for each of the three remaining factors. All remaining centroids are found in a similar fashion using all vertices which have (q-r-1) factor levels constant within a set for an r-dimensional face where $3 \le r \le q-2$. It should be noted that under the assumptions given above the dimensionality of the hyper-polyhedron formed by the extreme vertices will always be q-1.

EXAMPLE. In manufacturing one particular type of flare the chemical constituents are magnesium ($x_1$), sodium nitrate ($x_2$), strontium nitrate ($x_3$), and binder ($x_4$). Engineering experience has indicated that the following constraints on a proportion by weight basis should be utilized:

    .40 ≤ x_1 ≤ .60,
    .10 ≤ x_2 ≤ .50,
    .10 ≤ x_3 ≤ .50,
and .03 ≤ x_4 ≤ .08.

The problem is to find the treatment combination $(x_1, x_2, x_3, x_4)$ which gives maximum illumination as measured in candles.

The vertices of the polyhedron consisting of all the admissible points of the factor space are found by applying rules (1) and (2) above. The listing appears as
Treatment                               Treatment
Combination   x1    x2    x3    x4      Combination   x1    x2    x3    x4

              .40   .10   .10   -       (1)           .40   .10   .47   .03
              .40   .10   .50   -       (2)           .40   .10   .42   .08
              .40   .50   .10   -                     .40   .50   -     .03
              .40   .50   .50   -                     .40   .50   -     .08
              .60   .10   .10   -       (3)           .60   .10   .27   .03
              .60   .10   .50   -       (4)           .60   .10   .22   .08
              .60   .50   .10   -                     .60   .50   -     .03
              .60   .50   .50   -                     .60   .50   -     .08
(5)           .40   .47   .10   .03                   -     .10   .10   .03
(6)           .40   .42   .10   .08                   -     .10   .10   .08
              .40   -     .50   .03                   -     .10   .50   .03
              .40   -     .50   .08                   -     .10   .50   .08
(7)           .60   .27   .10   .03                   -     .50   .10   .03
(8)           .60   .22   .10   .08                   -     .50   .10   .08
              .60   -     .50   .03                   -     .50   .50   .03
              .60   -     .50   .08                   -     .50   .50   .08

(A dash marks a blank for which no admissible level exists.)


thus  indicating  eight  admissible  vertices  and  six  faces.  These  eight 
treatment  combinations  are  shown  in  Figure  4. 

In order to complete the design, one must determine the centroid of each of the six faces and the centroid of the polyhedron. To do this we list the treatment combinations that make up the six faces with the resulting centroids as follows:

Treatment                                   Treatment combinations
Combination   Centroid                      which form the face

(9)           (.50, .1000, .3450, .055)     (1), (2), (3), (4)
(10)          (.50, .3450, .1000, .055)     (5), (6), (7), (8)
(11)          (.40, .2725, .2725, .055)     (1), (2), (5), (6)
(12)          (.60, .1725, .1725, .055)     (3), (4), (7), (8)
(13)          (.50, .2350, .2350, .030)     (1), (3), (5), (7)
(14)          (.50, .2100, .2100, .080)     (2), (4), (6), (8)

and the final centroid of the polyhedron, of course, comes from the average of all eight treatment combinations and is

(15)          (.50, .2225, .2225, .055).
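
The face and overall centroids can be checked mechanically. The Python sketch below groups the eight vertices by the factor that sits at one of its bounds and averages each group; the names `VERTICES`, `BOUNDS`, `mean_point` and `face_centroids` are hypothetical. Grouping on a single fixed factor yields the (q-2)-dimensional faces, which coincide with the 2-dimensional faces only because q = 4 here; lower-dimensional faces of larger designs would need more than one factor held constant.

```python
# The eight extreme vertices of the flare example, keyed by treatment number.
VERTICES = {
    1: (0.40, 0.10, 0.47, 0.03),
    2: (0.40, 0.10, 0.42, 0.08),
    3: (0.60, 0.10, 0.27, 0.03),
    4: (0.60, 0.10, 0.22, 0.08),
    5: (0.40, 0.47, 0.10, 0.03),
    6: (0.40, 0.42, 0.10, 0.08),
    7: (0.60, 0.27, 0.10, 0.03),
    8: (0.60, 0.22, 0.10, 0.08),
}
BOUNDS = [(0.40, 0.60), (0.10, 0.50), (0.10, 0.50), (0.03, 0.08)]


def mean_point(points):
    """Average a list of points coordinate by coordinate."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(len(points[0])))


def face_centroids(vertices, bounds, tol=1e-9):
    """Return (member labels, centroid) for each face with one factor at a bound."""
    faces = []
    for i, (lo, hi) in enumerate(bounds):
        for level in (lo, hi):
            members = [k for k, v in vertices.items() if abs(v[i] - level) < tol]
            if len(members) >= 3:  # fewer than three vertices cannot span a 2-D face
                faces.append((members, mean_point([vertices[k] for k in members])))
    return faces


faces = face_centroids(VERTICES, BOUNDS)
overall = mean_point(list(VERTICES.values()))
for members, centroid in faces:
    print(members, [round(c, 4) for c in centroid])
print("overall", [round(c, 4) for c in overall])
```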

Fifteen flares, one assembled at each of the above treatment combinations, produced, respectively, the following amounts of illumination (measured in 1000 candles):

(1)    75        (6)   230        (11)  190
(2)   180        (7)   220        (12)  310
(3)   195        (8)   350        (13)  260
(4)   300        (9)   220        (14)  410
(5)   145        (10)  260        (15)  425

Standard least squares techniques were used on the above data to obtain the complete quadratic model (equation (2) above)

$$y = -1{,}558x_1 - 2{,}351x_2 - 2{,}426x_3 + 14{,}372x_4 + 8{,}300x_1x_2 + 8{,}076x_1x_3 - 6{,}625x_1x_4 + 3{,}213x_2x_3 - 16{,}998x_2x_4 - 17{,}127x_3x_4 .$$

The $R^2$ for this model is .9833, with only five degrees of freedom for residual. If only $x_1$, $x_2$, $x_3$, $x_1x_2$, $x_1x_3$, $x_1x_4$, $x_2x_3$ were used, the $R^2$ would be .9829, with eight degrees of freedom. Since all four variables still appeared in the latter model, the authors decided to retain the full model. The reader should recognize that, as in any model development problem, one must have stopping rules for evolving models from data. The purpose of this example, however, is merely to demonstrate the use of the regression model to determine the optimum treatment combination, not to elaborate on model development, per se.


In an attempt to determine the optimum treatment combination, Lagrange multipliers were utilized to determine the maximum of the above equation subject to the constraint

$$\sum_{i=1}^{4} x_i = 1 .$$

The resulting equations to be solved are

$$\begin{aligned}
8{,}300x_2 + 8{,}076x_3 - 6{,}625x_4 + \lambda &= 1{,}558 \\
8{,}300x_1 + 3{,}213x_3 - 16{,}998x_4 + \lambda &= 2{,}351 \\
8{,}076x_1 + 3{,}213x_2 - 17{,}127x_4 + \lambda &= 2{,}426 \\
-6{,}625x_1 - 16{,}998x_2 - 17{,}127x_3 + \lambda &= -14{,}372 \\
x_1 + x_2 + x_3 + x_4 &= 1
\end{aligned}$$

where $\lambda$ is the unknown Lagrange factor. The solution to these equations indicates the optimum treatment combination is

(.5020, .2786, .2203, -.0009)

which is obviously incorrect since all factor levels must be positive. It should be noted that the above approach would only be valid if the resulting factor levels (for the maximum y) fell within their respective constraints.
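
The five Lagrange equations are linear in $(x_1, x_2, x_3, x_4, \lambda)$, so the stationary point can be reproduced with any linear solver. The sketch below uses a small Gaussian elimination written out in plain Python so that it has no dependencies; the helper name `solve_linear` is hypothetical, and the coefficients are the rounded ones of the fitted model, so agreement with the quoted solution is expected only to about three decimals.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


# Unknowns ordered as x1, x2, x3, x4, lambda.
A = [
    [0.0, 8300.0, 8076.0, -6625.0, 1.0],
    [8300.0, 0.0, 3213.0, -16998.0, 1.0],
    [8076.0, 3213.0, 0.0, -17127.0, 1.0],
    [-6625.0, -16998.0, -17127.0, 0.0, 1.0],
    [1.0, 1.0, 1.0, 1.0, 0.0],
]
rhs = [1558.0, 2351.0, 2426.0, -14372.0, 1.0]

x = solve_linear(A, rhs)
print([round(v, 4) for v in x[:4]])
# The binder level x4 comes out slightly negative, i.e. outside .03 <= x4 <= .08.
print(x[3] < 0.03)
```

The stationary point lies outside the constrained region, which is why an unconstrained Lagrange solution cannot be accepted here.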
In order to consider all the necessary constraints, a more appropriate tool was utilized in estimating the optimum treatment combination. A non-linear programming routine (SHARE program No. 1399, Gradient Projection (GP 90), by Ruin P. Merrill, Shell Development Co., Emeryville, California) was used to yield the treatment combination

(.5233, .2299, .1668, .0800)

which is the desired solution to the problem. The predicted value of y for this optimum point is 397.48. It should be noted that this procedure only guarantees an optimum in the case where the response surface is a concave function.
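
As a check on the reported optimum, the fitted quadratic can be evaluated directly. The following Python sketch does so and also tests feasibility against the four factor constraints; the names `LINEAR`, `CROSS`, `BOUNDS`, `predicted` and `feasible` are hypothetical, and the coefficients are the rounded values printed in the fitted equation.

```python
LINEAR = [-1558.0, -2351.0, -2426.0, 14372.0]
CROSS = {
    (0, 1): 8300.0, (0, 2): 8076.0, (0, 3): -6625.0,
    (1, 2): 3213.0, (1, 3): -16998.0, (2, 3): -17127.0,
}
BOUNDS = [(0.40, 0.60), (0.10, 0.50), (0.10, 0.50), (0.03, 0.08)]


def predicted(x):
    """Evaluate the fitted quadratic mixture model (illumination in 1000 candles)."""
    y = sum(b * xi for b, xi in zip(LINEAR, x))
    y += sum(b * x[i] * x[j] for (i, j), b in CROSS.items())
    return y


def feasible(x, tol=1e-9):
    """True when the levels add to one and each lies within its constraints."""
    in_bounds = all(lo - tol <= xi <= hi + tol for xi, (lo, hi) in zip(x, BOUNDS))
    return in_bounds and abs(sum(x) - 1.0) < 1e-6


optimum = (0.5233, 0.2299, 0.1668, 0.0800)
lagrange_point = (0.5020, 0.2786, 0.2203, -0.0009)

print(round(predicted(optimum), 2), feasible(optimum))
print(feasible(lagrange_point))
```

The constrained optimum sits on the upper bound of the binder, $x_4 = .08$, which is consistent with the unconstrained stationary point lying outside the admissible region.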

It  is  quite  feasible  that  one  would  like  to  further  verify  the  initial 
estimate  of  the  optimum  condition.  This  could  be  done  by  applying 
another extreme vertices design to a localized region containing this
initial  point. 

An  additional  comment  on  this  experiment  is  that  the  fifteenth  obser¬ 
vation  seems  to  be  too  large  for  the  equation  used.  Further  experimenta¬ 
tion  is  necessary  to  investigate  this  feature  thoroughly.  It  is  hoped  that 
this  peculiarity  does  not  detract  from  the  purpose  of  the  paper,  namely 
to  show  a  unique  design  of  experiment  for  mixture  problems. 

FEATURES  OF  THE  DESIGN.  The  extreme  vertices  design  for 
mixture  problems  is  uniquely  determined  once  the  investigator  decides 
on  the  constraints  for  the  chosen  factors  to  be  used  in  the  experiment. 

In addition, the design allows investigation of the extreme points of the factor space as well as internal points in a manner similar to that used quite successfully in evolutionary operations. As pointed out in the example above, this design can be used sequentially to locate the optimum treatment combination.

As with all factorial type experiments the number of treatment combinations increases quite rapidly as the number of factors increases. As a guide to the number of treatment combinations which one might expect, Table 1 displays the minimum number of vertices and number of centroids in the 2-dimensional faces, 3-dimensional faces, etc., for use in designs containing up to 8 factors. Formulae for determining these figures as well as conservative upper bounds on the number of treatment combinations are given in a paper by [...]. Additional readings on the geometry of this type of configuration are given in references [1] and [4]. It is seen in Table 1 that the number of faces of the various dimensions, and with it the number of necessary treatment combinations, increases rapidly. One way of reducing the number of observations would be to delete certain centroids, say, those belonging to the even dimensional faces.


Table 1

Minimum design structure compared to
number of parameters for a quadratic model

Number of                      Face dimension              Minimum
factors (q)   Vertices    2     3     4     5     6     7  design size   Parameters

    3            3        1                                     4*            6
    4            4        4     1                               9*           10
    5            5       10     5     1                        21            15
    6            6       20    15     6     1                  48            21
    7            7       35    35    21     7     1           106            28
    8            8       56    70    56    28     8     1     227            36

*Extreme vertices design would have to be augmented with additional points if these cases occur.
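
The entries of Table 1 follow a simple pattern: the minimum number of vertices equals q, and the number of r-dimensional faces equals the binomial coefficient C(q, r+1), as for a simplex. This is an observation about the tabulated values rather than a formula quoted from the references. The Python sketch below regenerates the table under that assumption; `minimum_design` is a hypothetical name.

```python
from math import comb


def minimum_design(q):
    """Minimum structure for q factors, assuming simplex-like face counts."""
    vertices = q
    # One centroid per face of dimension r = 2, ..., q-1 (the last is the whole figure).
    centroids = {r: comb(q, r + 1) for r in range(2, q)}
    size = vertices + sum(centroids.values())
    parameters = q * (q + 1) // 2  # terms in the quadratic model (2)
    return vertices, centroids, size, parameters


for q in range(3, 9):
    vertices, centroids, size, parameters = minimum_design(q)
    flag = "*" if size < parameters else ""
    print(q, vertices, list(centroids.values()), f"{size}{flag}", parameters)
```

The starred rows are those where the minimum design has fewer points than the quadratic model has parameters, which is the situation the footnote addresses.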


Another  method  for  reducing  the  size  of  the  design  would  be  to 
compute  a  normalized  distance  between  points  of  the  design  and  randomly 
omit  points  that  are  less  than  a  certain  minimum  distance  from  other 
design  points.  The  minimum  distance  and  the  method  of  normalization, 
which  would  be  required  if  certain  components  are  much  more  sensitive 
than  others,  would  have  to  be  chosen  by  the  experimenter.  One  possible 
means  of  normalization  would  be  to  define  the  distance  between  two 
This method of normalization would assume that the sensitivity for each factor is inversely proportional to the length of its constraint interval.
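
A minimal sketch of such a reduction, assuming that the normalized distance divides each coordinate difference by the length $b_i - a_i$ of that factor's constraint interval before taking the Euclidean norm. That formula is an assumption chosen to match the sensitivity statement above, and the names `normalized_distance` and `thin_design`, the random visiting order, and the greedy rule for omitting points are likewise illustrative choices only.

```python
import math
import random


def normalized_distance(p, q, lower, upper):
    """Euclidean distance after scaling each factor by its constraint interval."""
    return math.sqrt(sum(((pi - qi) / (hi - lo)) ** 2
                         for pi, qi, lo, hi in zip(p, q, lower, upper)))


def thin_design(points, lower, upper, min_dist, seed=0):
    """Visit points in random order and drop any lying closer than min_dist to a kept point."""
    order = list(points)
    random.Random(seed).shuffle(order)
    kept = []
    for p in order:
        if all(normalized_distance(p, k, lower, upper) >= min_dist for k in kept):
            kept.append(p)
    return kept


lower = [0.40, 0.10, 0.10, 0.03]
upper = [0.60, 0.50, 0.50, 0.08]
centroid_15 = (0.50, 0.2225, 0.2225, 0.055)
centroid_14 = (0.50, 0.2100, 0.2100, 0.080)
centroid_13 = (0.50, 0.2350, 0.2350, 0.030)

d = normalized_distance(centroid_15, centroid_14, lower, upper)
kept = thin_design([centroid_13, centroid_14, centroid_15], lower, upper, min_dist=0.6)
print(round(d, 4), len(kept))
```

The minimum distance (0.6 here) is arbitrary; as the text notes, it would have to be chosen by the experimenter.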

In  light  of  the  above  discussion  it  is  easily  seen  that  a  computer 
program  for  determining  the  various  extreme  vertices,  centroids,  and 
normalized  distances  between  points  would  be  highly  desirable  when  q 
gets  greater  than  4  or  5.  At  the  moment,  no  such  program  exists; 
however,  writing  such  a  program  should  not  be  too  difficult. 
INTRODUCTION. Present-day requirements for extremely high-power radar and communication systems, high-energy particle accelerators, and ion-propulsion systems demand reliable operation of components at voltages up to a million volts. The demand is greatest in components such as vacuum tubes, vacuum capacitors, and ion-propulsion systems where high voltage must be insulated by vacuum in small spaces. Therefore, a reliable relation between the hold-off voltage and the factor or factors that affect an electrical breakdown in vacuum is needed for the design of these components.

The mechanism of voltage breakdown in a vacuum medium has been the object of wide investigation for many years. In spite of the voluminous literature on the subject, there are insufficient data available to permit a straightforward approach to the design of high-voltage sections of high-power electron tubes or other types of devices. In a study of the available literature, one finds a wide divergence in both the data and the theories that have been generated from the data. Fig. 1 shows the spread of the scattered data:


Fig. 1. Breakdown Data - Voltage versus Distance

*Sponsored by Advanced Research Projects Agency (ARPA Order No. 517), PROJECT DEFENDER, under ECOM Contract DA28-043 AMC-00394(E)
These curves are a few of the most closely grouped breakdown curves of those reported. For each curve a new theory was probably presented.

Our own experiments with high-voltage breakdown showed that there is more than one breakdown mechanism; a break in the curve exists around 1.5 mm spacing with a slope of 0.75 below 1.5 mm and a slope of 0.5 above. These experiments were carried out in the traditional manner, varying the distance between electrodes and recording a breakdown voltage for that spacing. It is obvious that, after each breakdown, measurement conditions in the electrode system are changed; surfaces are pitted or melted, gas is liberated, and even the conductivity of the insulating vacuum envelope is changed. For ideal experimentation, therefore, a method of avoiding breakdown would be desirable.

FACTORS AFFECTING BREAKDOWN. In order to investigate the mechanism of breakdown, the 16 factors shown in Table I were considered as probable contributors to the breakdown process:

TABLE I

FACTORS AFFECTING BREAKDOWN

1.  Cathode Material        9.  Envelope Diameter
2.  Anode Material         10.  Electrode Shield Size
3.  Cathode Finish         11.  Electrode Shield Placement
4.  Anode Finish           12.  Residual Gas Pressure
5.  Cathode Geometry       13.  Energy of Supply
6.  Anode Geometry         14.  Contaminant
7.  Vehicle Bakeout        15.  Magnetic Field
8.  Envelope Material      16.  Electrode Spacing

Traditional experimentation varying a few of these factors for each experiment leads to the neglecting of joint effects of more than one factor and probably is responsible for some of the spread in data seen in Fig. 1. A full factorial experiment, on the other hand, would require a prohibitive number of experiments and amount of time even though tests were performed at only two levels of each factor (2^16 = 65,536 treatment combinations for 16 factors). The initiation of such a massive experiment would only contribute to the already existing chaos in this field.
So as to bring order to this problem, a program of investigation of the breakdown process was initiated. The program is based on a statistical design plan that will consider all pertinent factors, without bias, so that the significance of the main effects and interaction effects can be analyzed.

It  was  recognized*  that  the  list  of  16  factors  could  be  separated  into 
two  groups,  as  shown  in  Table  II: 

TABLE  II 

INFLEXIBLE  AND  FLEXIBLE  FACTORS 

Inflexible  Factors  Flexible  Factors 

1.  Cathode  Material  12.  Residual  Gas  Pressure 

2.  Anode  Material  13.  Energy  of  Supply 

3.  Cathode  Finish  14.  Contaminant 

4.  Anode  Finish  15.  Magnetic  Field 

5.  Cathode  Geometry  16.  Electrode  Spacing 

6.  Anode  Geometry 

7.  Vehicle  Bakeout 

8.  Envelope  Material 

9.  Envelope  Diameter 

10.  Electrode  Shield  Size 

11.  Electrode  Shield  Placement 

The  inflexible  factors  are  those  that  are  constructional.  With  the  ex¬ 
ception  of  factor  7  -  Vehicle  Bakeout  -  they  cannot  be  varied  without  open¬ 
ing  the  vacuum  test  chamber.  The  flexible  factors  are  all  susceptible  to 
being  varied  continuously  without  disturbing  the  test  setup.  It  was  also 
recognized  that  the  last  four  of  the  inflexible  factors  were  factors  concerned 
with  a  particular  application  device  design  and  they  could  be  dropped  at 
this  time  to  reduce  the  complexity  of  the  experiment  and  to  accelerate  the 
investigation.  They  will  be  introduced  into  future  experiments. 

*In  discussions  with  C.  Daniel. 
The remaining factors will be investigated at the two levels shown in Table III, recognizing that we are assuming a linear model. Future experiments will allow us to build a prediction model from the results and to test at other levels in each factor space.

EXPERIMENTAL SETUP. The experiments will be run in the test vehicle shown in Fig. 2 at voltages up to 320 kilovolts. The chamber is equipped with access ports for electrode changes, optical viewing, X-ray detectors, and a mass spectrometer for monitoring the gap activity. For cleanliness, the whole chamber can be baked out by an oven assembly surrounding the chamber as well as the titanium sputter pump appended to the side, which controls the degree of vacuum. The power supply is a Van de Graaff generator that, for the high-energy level, charges up a bank of capacitors to 7000 joules. For the low level, the energy bank is not connected. The stored energy in this case is less than 1000 joules. The magnetic field is generated by two large field coils pivoted at the sides of the chamber so that they can generate perpendicular, parallel, and oblique fields. The chamber is constructed so that the factors that were dropped for the initial experiment can be included in future experiments: glass and ceramic envelopes of two diameters (large and small) can be installed, and electric shields can be placed around the electrodes at levels of interest. The length of the gap can be varied by a drive mechanism at the top of the chamber.

EXPERIMENTAL PROCEDURE. The first experiments will be conducted using a limited series of trials consisting of 32 runs as shown in Table IV. The table constitutes a quarter replicate of a seven-factor experiment taken at two levels of each factor. The seven factors used for this test plan will be the seven inflexible factors previously discussed. In each test run the flexible factors will be tested on a factorial basis at two or more levels for each treatment. Table IV was derived by using the five letters, A, B, D, E, and G, with defining relations C = ABE, F = ABDG, in Table M of Davies' "Design and Analysis of Industrial Experiments." [1]

The  design  shows  the  levels  of  each  factor  for  each  of  the  32  runs. 

The  minus  sign  in  each  run  means  that  the  factor  is  either  at  the  low  level 
or  absent  from  the  treatment;  the  plus  sign  means  that  the  factor  is  at  the 
high  level  or  present  in  the  treatment.  The  set  is  orthogonal;  i.  e.  ,  each 
level  of  any  factor  is  tested  equally  against  each  of  the  other  factor  level 
combinations. 
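
A quarter replicate of this kind can be generated and checked in a few lines. The Python sketch below builds the 32 runs from the five base letters A, B, D, E, G and derives C and F as sign products; it assumes the generator words ABE and ABDG with positive sign, and its run order is not claimed to match Table IV. The names `BASE`, `GENERATORS`, `build_design`, `column` and `dot` are hypothetical.

```python
from itertools import product

BASE = "ABDEG"
GENERATORS = {"C": "ABE", "F": "ABDG"}  # assumed positive-sign generators


def build_design():
    """Return the 32 runs as dicts mapping factor letter to -1 or +1."""
    runs = []
    for levels in product((-1, 1), repeat=len(BASE)):
        run = dict(zip(BASE, levels))
        for letter, word in GENERATORS.items():
            sign = 1
            for parent in word:
                sign *= run[parent]
            run[letter] = sign
        runs.append(run)
    return runs


def column(runs, word):
    """Sign column of a main effect ("A") or an interaction ("AD")."""
    out = []
    for run in runs:
        sign = 1
        for letter in word:
            sign *= run[letter]
        out.append(sign)
    return out


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


runs = build_design()
print(len(runs))                                       # 32
print(dot(column(runs, "A"), column(runs, "B")))       # 0: main effects are orthogonal
print(column(runs, "AB") == column(runs, "CE"))        # True: AB and CE are aliased
print(dot(column(runs, "AD"), column(runs, "BC")))     # 0: AD is clear of BC
```

Two columns that are identical (or exact negatives) cannot be separated by the experiment; columns with zero inner product can be estimated independently.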
The following letter assignments were carefully chosen so that in the treatment and analysis of the results the effect of any two-factor interaction involving the Bakeout factor, D, would be clear of any other main effect or two-factor interaction of interest:

TABLE  V 

LETTER  ASSIGNMENT 
A  -  Anode  Material 
B  -  Cathode  Shape 
C  -  Cathode  Material 
D  -  Bakeout 
E  -  Anode  Shape 
F  -  Anode  Finish 
G  -  Cathode  Finish 

The isolation of the bakeout main effect and all two-factor interactions involving bakeout was designed into the experiment with the objective of eliminating bakeout in future experiments. Since bakeout of the large mass of the test vehicle is a long time process of heating and cooling, it would be desirable to eliminate it if results indicate negligible main and two-factor effects. The absence of bakeout in the test run involves the use of inert gases during the time that the test vehicle is being modified and the testing of the electrodes, themselves, using built-in heaters. There is, therefore, a possibility that the lack of a bakeout of the entire structure will not influence the test results.

A,  B,  and  C  were  assigned  to  factors  whose  interactions  with  each 
other  could  be  assumed  to  be  negligible. 

The tabulation of minus and plus signs shown in Table IV not only gives the levels of the factors but indicates how the data collected from all of the test runs, or treatments, should be handled in order to determine each effect; i.e., to determine the A effect, the test results for tests 1 to 32 are added where a plus sign is present under column A and subtracted for the minus signs. For two-factor interactions, the two columns are first multiplied one by the other and then the data are treated in accordance with the resulting column. The results can be obtained in a more systematic manner by using the Yates Algorithm, which consists of repeatedly adding and subtracting adjacent test results [2] until the results for mean, main effects, and two-factor interactions are obtained as shown in Table VI. All the two-factor interactions are measurable except AB, CE, AC, BE, AE, and BC. As can be seen, we can get seven main effects and six two-factor interactions with D (the bakeout) plus the mean, which allows eighteen degrees of freedom for estimating error (32 - 7 - 6 - 1 = 18).
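
The two ways of computing an effect described above, by signed sums and by the Yates algorithm, give the same contrasts. The Python sketch below illustrates this on a hypothetical 2^2 data set in standard order ((1), a, b, ab); the response values are invented for illustration, and `yates` and `contrast_by_signs` are hypothetical names. For the 32-run fraction the same routine would be applied to the five base letters in standard order, each output then being read together with its aliases.

```python
def yates(responses):
    """Yates' algorithm for 2^k responses in standard order.

    Returns [total, A, B, AB, C, AC, ...] contrasts; dividing a contrast
    by half the number of runs gives the usual effect estimate.
    """
    n = len(responses)
    k = n.bit_length() - 1
    if n != 2 ** k:
        raise ValueError("number of responses must be a power of two")
    col = list(responses)
    for _ in range(k):
        sums = [col[i] + col[i + 1] for i in range(0, n, 2)]
        diffs = [col[i + 1] - col[i] for i in range(0, n, 2)]
        col = sums + diffs
    return col


def contrast_by_signs(responses, factors):
    """Add or subtract each response according to the product of its factor signs.

    factors lists factor indices (0 = A, 1 = B, ...); in standard order,
    bit f of the run index is 1 when factor f is at its high level.
    """
    total = 0
    for run, y in enumerate(responses):
        sign = 1
        for f in factors:
            sign *= 1 if (run >> f) & 1 else -1
        total += sign * y
    return total


data = [1, 3, 5, 11]  # hypothetical responses for (1), a, b, ab
print(yates(data))                      # [20, 8, 12, 4]
print(contrast_by_signs(data, [0]))     # 8  (A contrast)
print(contrast_by_signs(data, [0, 1]))  # 4  (AB contrast)
```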

We  expect  that  this  analysis  will  tell  us  the  significance  of  each  main 
factor  and  two-factor  interaction  involving  D  and  thus  allow  us  to  better 
design  an  experiment  that  includes  only  the  important  factors  in  a  full 
factorial  for  a  complete  significant  factor  space  investigation. 

An investigation is now under way searching for a repeatable, nondestructive performance criterion that can be obtained without taking the electrodes to breakdown. This criterion is necessary to make measurements for the values of voltage to be used for the analysis. The areas being investigated to find a breakdown criterion are: gap current; X-radiation; gas evolution and gas analysis; and the spectral response of visible radiation as a function of voltage. All of these characteristics will be continuously monitored with the hope that one or all will permit the onset of breakdown to be predicted. To prevent severe damage to the electrodes and the system in the event that breakdown does occur, an electronic energy diverter will be incorporated in the test setup. The diverter can be triggered by a chosen level of current, X-ray output, or the output of a photomultiplier, and can respond in a microsecond or less after a fault is sensed to remove the voltage from the gap.

Two or three runs per week will be carried out according to the dictate of the inflexible factors that require change. Changes of the inflexible factors will be made in an ultraclean, dry nitrogen, pressurized white-bench atmosphere. This atmosphere is monitored for dust particles and water vapor content. The materials for the electrodes will be certified from a single heat and will be chemically analyzed for record purposes. The electrode finishes will be obtained by precise polishing techniques, with prescribed abrasives down to 0.05 micron size particle finish for the "fine" level. The electrodes are being constructed with Bruce profiles so that the E field is maximum in the gap. The vacuum pumping system is an ultraclean, oil-free cryogenic and titanium ion sputter system.
CONCLUSIONS. It is expected that sufficient information will be
collected  during  these  pilot  experiments  to  permit  elimination  of  factors 
having  minor  effects  and  to  permit  a  more  comprehensive  design  for  the 
final  experiment.  The  initial  32  runs  are  specifically  aimed  at  the  bakeout 
factor,  D;  hopefully  to  eliminate  this  time-consuming  process  in  subsequent 
experiments.  The  final  experiment  will  be  a  full  factorial  using  only  those 
factors  that  are  determined  to  be  significant  in  this  pilot  experiment. 

Results  from  the  pilot  experiment  are  now  being  collected. 

The  techniques  developed  for  this  program  are  applicable  to  other 
studies  in  the  physical  sciences  where  large  numbers  of  variables  of  both 
a  qualitative  and  nonqualitative  nature  are  involved. 
SOME INFERENTIAL STATISTICS WHICH ARE RELATIVELY
COMPATIBLE WITH AN INDIVIDUAL ORGANISM METHODOLOGY

Samuel  H.  Revusky 

U.  S.  Army  Medical  Research  Laboratory,  Fort  Knox,  Kentucky 

ABSTRACT. A number of new statistical techniques are described
which  are  very  sensitive  to  effects  of  an  independent  variable  when  a 
relatively  small  number  of  subjects  are  used  and  the  effects  of  the 
independent  variable  are  irreversible.  The  same  notions  are  generalized 
to  the  case  when  the  effects  of  the  independent  variable  are  reversible. 

INTRODUCTION. Operant conditioning techniques are most useful when a number of experimental conditions are tested on a single animal. It is an empirical fact that within-subject comparisons are far more sensitive to small effects than between-subject comparisons. Furthermore, when within-subject experimental manipulations are not made, each S can only contribute one data item (strictly speaking) toward a valid statistical analysis because of the requirement, central to inferential statistics, of random sampling. The net result is that splitting a number of subjects into groups will yield evidence only of very pronounced effects unless a very large number of Ss are used. The establishment of a complex operant performance is usually too time consuming and difficult to permit use of a large number of Ss, so that statistical procedures in which each S supplies only one data item are usually not practical. Similar considerations are applicable to many subject matters in addition to operant conditioning. For these reasons, as well as some others, single organism methodologies with within-subject control* have been among the most prominent scientific techniques.

But  the  use  of  a  number  of  scores  from  a  single  3  as  separate  inputs 
into  statistical  tests  does  not  rigorously  adhere  to  the  assumptions 
involved  in  inferential  statistics  when  the  independent  variable  (IV)  has 
irreversible  effects,*  Examples  of  such  IVs  are  x-irradiation  and 

^By  irreversible  effect  we  mean,  in  the  present  context,  the  case  in  which 
baseline  data  cannot  readily  be  recovered  after  the  IV  is  administered. 

It  is  possible  to  compare  performance  after  the  IV  with  baseline  perform¬ 
ance,  but,  for  each  S,  only  one  statistically  independent  data  item,  such 
as  a  difference  score,  can  be  used  rigorously  as  input  into  statistical  tests 
of  the  type  in  general  use.  This  is  because  the  data  obtained  after  intro¬ 
duction  of  the  IV  always  follows  the  baseline  data  so  that  the  sampling  can¬ 
not  be  random.  Of  course,  one  may  decide  (legitimately,  I  think)  that  such 
violation  of  random  sampling  will  be  of  little  practical  importance  in  some 
particular  instance. 


~i  rtrt 
-*  w  v 


of  Tirr>d‘n^ ? 

surgery;  in  certain  contexts,  drive  operations  and  novel  stimuli  may  also 
be  considered  irreversible.  Thus,  it  would  seem,  at  first  glance,  that 
the  use  of  difficult  individual  organism  techniques  is  usually  impractical 
when  irreversible  IVs  are  studied  and  assessment  of  the  results  is  by 
means  of  inferential  statistics.  But  due  to  a  recent  development  in 
statistical  methodology  (Cronholm  and  Revusky,  1965),  such  assessment 
is  not  as  impractical  as  it  used  to  be.  The  reason  is  that  statistically 
rigorous  inference  about  irreversible  IVs  may  be  made  with  a  smaller 
number of Ss than has hitherto been feasible. First, I will describe
the basic idea underlying the Cronholm-Revusky paper in concrete and
intuitive  form,  and  then  I  will  extend  some  of  its  notions  to  more  complex 
experimental  designs. 

THE METHOD. Suppose 6 Ss are trained to final performance
on a complex schedule of reinforcement and we wish to assess the effects
of a novel stimulus on this performance. Since exposure to a novel
stimulus has irreversible effects (in the weak sense that after the first
exposure the stimulus no longer is novel), conventional experimental
design requires that we randomly divide the 6 Ss into groups of 3 each,
subject one group to the novel stimulus and the other group to a control
procedure. For analytic purposes, we shall always refer to rank tests
and in the present example, the rank test to be used would be the
Mann-Whitney U Test (Siegel, 1956). With this test, the total number of
possible (and equiprobable) outcomes is 6!/(3! 3!) = 20 and the
probability of the most extreme outcome is 3! 3!/6! = 1/20 = .05. Thus,
the maximum significance level obtainable with 6 Ss and a conventional
experimental design is .05 (one-tailed). Only with extremely pronounced
effects would it seem intuitively reasonable to study any hypothesis with
fewer than 10 Ss, and for many operant conditioning procedures, this is
an impracticably large number of Ss.

With the same 6 Ss a result significant beyond the one-tailed .002
level is possible, if the following technique described by Cronholm and
Revusky is used. First, administer the novel stimulus to one randomly
selected S and the control procedure to the remaining 5 Ss. Rank the
performance of the experimental S with respect to the 5 controls. Thus,
the statistical outcome of this procedure (which may be called a
sub-experiment) is a rank from 1 to 6. We now have 5 Ss which have not
been exposed to novelty. Randomly select one for exposure to the
novel stimulus and rank it, as before, with respect to the 4 controls.
This rank will be between 1 and 5. Now continue this process until one
S remains; this last S will receive a rank of 1 regardless of what it
does. Table 1 is a precis of the procedure.

No claim is made here that all results must be assessed by means of
inferential statistics.


TABLE 1

Precis of the experimental design used with R_n. Each line contains
the possible outcomes of one sub-experiment. The sub-experiments are
numbered in chronological order.

Sub-experiment    Possible, equiprobable, ranks
      1           1, 2, 3, 4, 5, 6
      2           1, 2, 3, 4, 5
      3           1, 2, 3, 4
      4           1, 2, 3
      5           1, 2
      6           1

Since the total number of outcomes in each sub-experiment is
equal to the number of possible ranks, the total number of outcomes
over all 6 sub-experiments shown in Table 1 is equal to the product
of the number of equiprobable outcomes for each sub-experiment; that
is, 6 x 5 x 4 x 3 x 2 x 1 = 6! = 720. It is this large number of outcomes,
compared to the 20 possible outcomes of the Mann-Whitney U with
6 Ss, which is the secret of the remarkable sensitivity of the procedure
we are describing.
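The two outcome counts can be checked with a few lines of arithmetic. The following is only an illustrative sketch in Python; the variable names are hypothetical and are not part of the original procedure.

```python
from math import comb, factorial

n_subjects = 6

# Conventional design: 3 experimental Ss and 3 controls (Mann-Whitney U).
u_outcomes = comb(n_subjects, n_subjects // 2)   # 6!/(3! 3!)
u_min_p = 1 / u_outcomes                          # smallest one-tailed probability

# Sequential sub-experiments: 6 x 5 x 4 x 3 x 2 x 1 equiprobable outcomes.
rn_outcomes = factorial(n_subjects)
rn_min_p = 1 / rn_outcomes

print(u_outcomes, u_min_p)
print(rn_outcomes, rn_min_p)
```

The smallest attainable probability with the sub-experiment design is 1/720, which is what places the most extreme result beyond the one-tailed .002 level.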

Now we will determine the chance distribution of the results of
the sub-experiments so that results obtained by this procedure can
be subjected to inferential statistical analysis. Chance is defined to
mean that the random selection of the experimental S in each
sub-experiment alone determines the probability of any rank outcome; in
other words, the novel stimulus is assumed to have absolutely no effect
on what is measured. Given this definition of chance, in each
sub-experiment each of the possible outcomes is equally probable. Thus,
in sub-experiment 1, each rank from 1 to 6 has a probability of 1/6. In
sub-experiment 2, each possible rank has a probability of 1/5. And so
on. A physical model of the chance distribution may make it clearer.
Sub-experiment 1 is similar to the toss of a true die and the rank
outcome is equivalent to the number of pips which appear. Sub-experiment
2 is the toss of a five-sided die, with a different number of pips (from
1 to 5) on each side. And so on.

Under this assumption, each sub-experiment may be said to have a
probability generating function of its own, which is of no intrinsic
interest, but is necessary for the understanding of the probability
generating function of R_n, as well as the other statistics to be described
in this paper. When k is the number of possible ranks, this function is

$$\frac{1}{k}\sum_{i=1}^{k} s^{i}$$

The coefficient of the ith power of s in this function is equal to the
probability that the rank obtained in the sub-experiment will be equal
to i. The symbol s has no numerical meaning and its only function is to
supply a place for the exponent i, which indicates the outcome for which
the coefficient of s^i is the probability. For instance if k = 5, the
function is

$$\frac{1}{5}s^{1} + \frac{1}{5}s^{2} + \frac{1}{5}s^{3} + \frac{1}{5}s^{4} + \frac{1}{5}s^{5} = \frac{1}{5}\sum_{i=1}^{5} s^{i}$$

This function means that each rank from 1 to 5 has a probability of 1/5.

The statistic to be used to evaluate the probability of the entire
series of sub-experiments is simply the sum of the ranks obtained in
each sub-experiment (called R_n). To find the generating function of
R_n, we multiply together the generating functions for each sub-experiment.
For instance, when n = 6, we have

$$\left(\frac{1}{6}\sum_{i=1}^{6} s^{i}\right)\left(\frac{1}{5}\sum_{i=1}^{5} s^{i}\right)\cdots\left(\frac{1}{1}\sum_{i=1}^{1} s^{i}\right) = \frac{1}{6!}\prod_{k=1}^{6}\sum_{i=1}^{k} s^{i}$$

I will clarify the meaning of this generating function by multiplying it
out and then explaining it; the more formally inclined reader may
consult Cronholm and Revusky (1965), where an intuitively less understandable
but easier to use version of this generating function is explained.

$$\frac{s^{6}}{720} + \frac{5s^{7}}{720} + \frac{14s^{8}}{720} + \cdots + \frac{s^{21}}{720}$$

In the above expansion, the coefficient of any power of s is equal to the
chance probability that the value of R_n equal to that power will occur.
More specifically, the coefficient is a fraction, the numerator of which
is equal to the number of outcomes which result in the corresponding
value of R_n and the denominator of which is equal to the total number
of possible outcomes. The probabilities shown are not cumulative. To
obtain the cumulative probability the probabilities of all more extreme
events must be added to the probability of the event itself. For instance,
the probability that R_n = 8 is 14/720; the probability that R_n ≤ 8 is
1/720 + 5/720 + 14/720 = 20/720. It is apparent that the smallest
possible value of R_n, 6, has a probability of 1/720, as against a
smallest possible probability of 1/20 for a U test utilizing the same
number of Ss.

Cronholm and Revusky (1965) have supplied a detailed description
of the properties of R_n, a rigorous discussion of its sensitivity to small
effects as compared with the Wilcoxon T (which is functionally identical
to the U test), and a table suitable for practical use of the statistic with
up to 12 Ss. They also discuss when the R_n procedure should and should
not be used, as well as its use as a quasi-sequential test. One matter of
particular importance to operant conditioners is that one can use such
measures as percentage change in each sub-experiment without affecting
the chance distribution; this permits a correction for the base line of
each S. This will be true of all the tests to be mentioned in this paper,
as well as most common statistical tests.
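The chance distribution of R_n for 6 Ss can be reproduced by multiplying out the "dice" of the physical model numerically. The sketch below is an illustration only, not the tabulation method of Cronholm and Revusky; the function names are hypothetical, and it assumes (as in the text) that small rank sums are the extreme outcomes.

```python
from fractions import Fraction

def rank_sum_counts(sides):
    """Count the outcomes giving each rank sum.

    sides: number of equiprobable ranks in each sub-experiment.
    Returns a dict mapping rank sum -> number of outcomes with that sum.
    """
    counts = {0: 1}
    for k in sides:
        new_counts = {}
        for total, ways in counts.items():
            for rank in range(1, k + 1):
                new_counts[total + rank] = new_counts.get(total + rank, 0) + ways
        counts = new_counts
    return counts

# Six sub-experiments with 6, 5, 4, 3, 2 and 1 possible ranks (Table 1).
counts = rank_sum_counts([6, 5, 4, 3, 2, 1])
n_outcomes = sum(counts.values())

def lower_tail(r):
    """Cumulative chance probability that the rank sum is r or smaller."""
    return Fraction(sum(w for t, w in counts.items() if t <= r), n_outcomes)

print(counts[6], counts[7], counts[8], n_outcomes)
print(lower_tail(8))   # 20/720, printed in lowest terms
```

The counts for rank sums 6, 7 and 8 are the numerators 1, 5 and 14 of the expansion, and the cumulative probability for a sum of 8 or less agrees with the 20/720 given in the text.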

PURPOSE OF EXTENSION OF THE R_n METHOD. The basic idea
underlying R_n, the use of a number of sub-experiments each containing
one experimental S and a number of controls, can generate a large
number of statistical techniques more compatible with a single-organism
methodology than conventional statistics. Unfortunately, in practice, the
experimenter will have to supply his own probability generating function
if he must depart from a straightforward use of R_n, because the
number of possible variations on the basic procedure is huge. The tedium
of computing generating functions is partially compensated for by the
ease with which the statistics can be computed. The remainder of this
paper will consist of examples of statistics tailored for particular
experiments in the hope that they will be a guide for anybody who has
special needs to be filled. The rationale for this unusual procedure is
that it increases the flexibility of the experimenter's attack on the
subject matter.

A VARIETY OF LEVELS OF THE IV; ONE LEVEL STUDIED IN
EACH SUB-EXPERIMENT. Suppose we are studying the effects of a
poison on stabilized performance and wish to use 3 dose levels. We
are willing to assume that the direction of the effects does not change
as a function of dose level; for example, if one dose level either
improves or interferes with performance, any of the other dose levels
to be used either will do the same or will have no effect. If it is
reasonable to suppose that one dose level improves performance and
a second dose level interferes with it, the present type of analysis
makes no sense, although modifications, to be mentioned later, may
be made for such situations.

We begin with 10 Ss under a variant of the R_n procedure
described by Table 2. The change in our procedure is that for each
sub-experiment one of the 3 dose levels is used for the experimental S.

TABLE 2

The experimental procedure by which the R_n statistic is used
to study the effects of 3 levels of an IV.
Chronological order is from top to bottom.

Sub-experiment    Dose Level    Possible, equiprobable, ranks
      1               A         1, 2, 3, 4, 5, 6, 7, 8, 9, 10
      2               B         1, 2, 3, 4, 5, 6, 7, 8, 9
      3               C         1, 2, 3, 4, 5, 6, 7, 8
      4               A         1, 2, 3, 4, 5, 6, 7
      5               B         1, 2, 3, 4, 5, 6
      6               C         1, 2, 3, 4, 5
      7               C         1, 2, 3, 4
      8               B         1, 2, 3
      9               A         1, 2


Thus, we select 3 sub-experiments to test each of the 3 dose levels;
we do not use the last sub-experiment for purposes of statistical
inference because its outcome is predetermined. To assess the probability
of the overall effect, we simply use R_n, ignoring the individual dose
levels. To obtain a separate statistic for each dose level, we add the
ranks obtained in the sub-experiments in which that dose level was
used. Thus for dose levels A, B and C, we have r_A, r_B and r_C.

The generating functions for each of these 3 statistics are
straightforward. Consider r_A and remember our physical analogy. Table 2
shows that dose level A is used in sub-experiments 1, 4 and 9. The
analogy for each of these sub-experiments, respectively, is a ten-sided
die, a seven-sided die, and a two-sided die. Thus, the probability
generating function of r_A may be constructed much like the probability
generating function for R_n:

$$\frac{\left(\sum_{i=1}^{10} s^{i}\right)\left(\sum_{i=1}^{7} s^{i}\right)\left(\sum_{i=1}^{2} s^{i}\right)}{10 \cdot 7 \cdot 2}$$

where, as before, the coefficient of any power of s is the
probability that a sum of ranks equal to that power will occur.
For similar reasons, the generating function for r_B is

$$\frac{\left(\sum_{i=1}^{9} s^{i}\right)\left(\sum_{i=1}^{6} s^{i}\right)\left(\sum_{i=1}^{3} s^{i}\right)}{9 \cdot 6 \cdot 3}$$

and the generating function of r_C is

$$\frac{\left(\sum_{i=1}^{8} s^{i}\right)\left(\sum_{i=1}^{5} s^{i}\right)\left(\sum_{i=1}^{4} s^{i}\right)}{8 \cdot 5 \cdot 4}$$


Inspection of the denominators of these 3 generating functions shows
140 possible outcomes for dose level A, 162 outcomes for B and 160
outcomes for C. I contrived the sequence of administration of the dose
levels so that the number of outcomes for each dose level would be as
nearly equal as I could make it, in the hope that the statistical power at
each dose level would then be similar. Of course, this may not be
desirable in some cases.
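The denominators for the three dose levels follow directly from the assignment in Table 2, and the full chance distribution of each statistic can be obtained by the same multiplication used for R_n. The following sketch is illustrative only; the dictionary `design` and the function name are hypothetical labels for the arrangement described above.

```python
from math import prod

# Number of equiprobable ranks in the sub-experiments given each dose level,
# read from Table 2 (the predetermined last sub-experiment is not used).
design = {"A": [10, 7, 2], "B": [9, 6, 3], "C": [8, 5, 4]}

def rank_sum_counts(sides):
    """Return a dict mapping rank sum -> number of outcomes with that sum."""
    counts = {0: 1}
    for k in sides:
        new_counts = {}
        for total, ways in counts.items():
            for rank in range(1, k + 1):
                new_counts[total + rank] = new_counts.get(total + rank, 0) + ways
        counts = new_counts
    return counts

# Denominator of each generating function: the product of the numbers of ranks.
outcomes = {level: prod(sides) for level, sides in design.items()}
distributions = {level: rank_sum_counts(sides) for level, sides in design.items()}

for level in design:
    # The smallest rank sum (all ranks equal to 1) arises in exactly one way.
    print(level, outcomes[level], 1 / outcomes[level])
```

A different ordering of the dose levels can be tried simply by editing `design`, which makes it easy to see how evenly a proposed sequence spreads the number of outcomes across levels.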

The net result is that in the above example, the significance of
an overall effect can be determined. Given a significant overall effect,
the significance of the effect at each dose level can be determined.
Unfortunately, however, there is no reasonably powerful way to
assess differences between the effects of the different dose levels. The
best that can be done is to use the Kruskal-Wallis one-way analysis
of variance (Siegel, 1956) to compare the magnitude of the effects at
different dose levels; the input into this test is all the experimental
scores and none of the control scores; the assumption is made that the
effects do not change over sub-experiments.

Still  more  statistical  sensitivity  may  be  obtained  with  the  above 
procedure  if  some  results  may  be  discounted  before  the  data  are 
collected.  An  analogy  from  conventional  statistics  is  the  one-tailed 
test  in  which  the  experimenter  is  so  certain  that  the  results  will 
occur  in  only  one  direction,  that  he  is  willing  to  state  that  any  result 
in  the  opposite  direction,  no  matter  how  extreme,  is  a  sampling  error. 
Similarly,  in  the  present  case,  we  may  be  entirely  certain  that  if  any 
effect  exists,  dose  level  A  (the  lowest  level)  will  have  the  smallest 
effect  and  dose  level  C  (the  highest  level)  will  have  the  largest  effect. 

If we are willing to assert that any other result is due to chance, we
may divide our obtained probability levels by 6 because there are
3! = 6 possible permutations of the results obtained for the 3 dose
levels, and we are assuming only one of these 6 possible outcomes
can be non-chance. Alternatively, we may also accept a significant
result if A has the largest effect and C has the smallest effect, in which
case the probability level may be divided by 3 since 2 of the 6 possible
permutations are acceptable as not due to chance. Of course, if the
data seem to clearly contradict one's preconceptions, one is in the
unenviable position of discarding data not because of anything in nature
but because of the foolishness of his a priori notions. On the other
hand, if one does accept the unexpected result as not due to chance, the
true probability of rejection of the null hypothesis at the chance .05 level
will be .30 if only one permutation had been expected and .15 if one of
two permutations had been expected. I think the best solution in the event
of  an  unexpected  outcome  is  to  repeat  the  experiment  unless  the 
unexpected  result  is  entirely  convincing  without  any  formal  statistical 
evidence  in  its  favor. 

A NUMBER OF LEVELS OF THE IV; ONE S IN EACH LEVEL
TESTED IN EACH SUB-EXPERIMENT. The preceding application
included 9 sub-experiments. A variant on this procedure, also utilizing
10 Ss, permits a reduction to 3 sub-experiments as follows: (a)
Sub-experiment 1. Beginning with 10 Ss, randomly assign 1 S to each dose
level and utilize 7 controls. (b) Sub-experiment 2. Of the 7 controls of
sub-experiment 1, randomly assign 1 S to each of the dose levels and use
the 4 remaining Ss as controls. (c) Sub-experiment 3. Repeat the
procedure with 3 experimental Ss and 1 control. In this design, the
probability generating function for each dose level is straightforward, but
the assessment of whether an overall effect occurred is difficult. Therefore,
we will begin backward with an assessment of the effects at the separate
dose levels and then we will consider the overall effect.

Consider dose level A. A rank is obtained for each sub-experiment
by ranking the subject receiving dose level A with respect to the controls
and ignoring the results obtained with levels B and C. These ranks are
then summed over the 3 sub-experiments. The following probability
generating function is applicable.

$$\frac{\left(\sum_{i=1}^{8} s^{i}\right)\left(\sum_{i=1}^{5} s^{i}\right)\left(\sum_{i=1}^{2} s^{i}\right)}{8 \cdot 5 \cdot 2}$$

A similar statistic is obtained for levels B and C; of course, their
probability generating functions are the same as for level A. It should
be noted that the denominator of the generating function shows 80
possible outcomes; when only one experimental S was run at a time
in the otherwise similar design of the preceding section, the smallest
number of outcomes was 140. Thus it is evident that this method reduces
the number of sub-experiments needed in the preceding section at the
price of some power. Whether this price is worth paying is up to the
experimenter.
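One way to see what the reduction from 140 to 80 outcomes costs is to compare the rank sums that reach the .05 level under each design. The sketch below is illustrative only; `critical_rank_sum` is a hypothetical helper, and it assumes that small rank sums are the extreme (one-tailed) outcomes, as in the R_n example.

```python
from fractions import Fraction

def rank_sum_counts(sides):
    """Return a dict mapping rank sum -> number of outcomes with that sum."""
    counts = {0: 1}
    for k in sides:
        new_counts = {}
        for total, ways in counts.items():
            for rank in range(1, k + 1):
                new_counts[total + rank] = new_counts.get(total + rank, 0) + ways
        counts = new_counts
    return counts

def critical_rank_sum(sides, alpha=Fraction(1, 20)):
    """Largest rank sum whose lower-tail chance probability is <= alpha.

    Returns (critical rank sum or None, total number of outcomes).
    """
    counts = rank_sum_counts(sides)
    total = sum(counts.values())
    cumulative, critical = 0, None
    for r in sorted(counts):
        cumulative += counts[r]
        if Fraction(cumulative, total) <= alpha:
            critical = r
        else:
            break
    return critical, total

print(critical_rank_sum([8, 5, 2]))    # one S per dose level in each sub-experiment
print(critical_rank_sum([10, 7, 2]))   # dose level A in the preceding design
```

Exact fractions are used so that a cumulative probability of exactly 4/80 = .05 is not lost to floating-point rounding.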

We are now faced with 3 statistics and the problem of deciding if
the overall pattern is due to chance; obviously the chance probability
that at least one of these statistics will be significant at the .05 level
is higher than .05, which will be taken, in this discussion,
to be the rejection level for the chance hypothesis. There are 3 ways
of doing this and the experimenter must select the most reasonable way
for his particular experiment before he has seen the data. The first
2 of these ways are also applicable to the method of the preceding
section in cases where one dose level may improve performance and a
second level may interfere with it. Following are the 3 ways:

a. If the result is significant at the .05 level at the highest dose
level, assume any other apparently significant results are real. If it
is not, assume any other significant results are spurious.

b. If each of 3 statistical probabilities were independent, one or
more of the 3 results would be significant at the .017 level with a
probability of .05. Since the results are not entirely independent
because they all depend on the same control scores, a conservative
guess at the chance level is .02. If one of the 3 results has a chance
probability below .02, regard any other results significant at the .05
level as non-chance.

c. Combine all 3 dose levels for each sub-experiment and regard
it as the comparison of an experimental group with a control group. Then,
for each sub-experiment, obtain a probability level by some conventional
test; the Mann-Whitney U test (Siegel, 1956) would be very consistent
with our other tests because it is a rank test. Then combine the 3
obtained probabilities by means of the z-transformation (Mosteller and
Bush, 1954). If, and only if, the combined probability level is below
.05, there is a significant overall effect. If this method is to be
sensitive, it must be reasonable to suppose that all dose levels act
in the same direction on the performance.³ Because U is a discrete
distribution, the combined probability will be conservative.
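The arithmetic behind ways b and c can be sketched as follows. This is an illustration under stated assumptions, not a reproduction of the procedure in Mosteller and Bush: it assumes the z-transformation refers to the familiar method of converting each one-tailed probability to a normal deviate, summing, and dividing by the square root of the number of tests. The function names and the example probabilities are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def per_test_level(alpha, m):
    """Level each of m independent tests must reach so that the chance of
    at least one of them reaching it is alpha (way b)."""
    return 1 - (1 - alpha) ** (1 / m)

def combine_z(p_values):
    """Combine one-tailed probabilities by summing normal deviates (way c).

    Assumes every probability lies strictly between 0 and 1.
    """
    std = NormalDist()
    z = sum(std.inv_cdf(1 - p) for p in p_values) / sqrt(len(p_values))
    return 1 - std.cdf(z)

print(round(per_test_level(0.05, 3), 3))   # the .017 level mentioned in way b

# Hypothetical one-tailed probabilities from 3 sub-experiments.
print(combine_z([0.10, 0.08, 0.20]))
```

With the hypothetical values shown, no single sub-experiment reaches the .05 level, yet the combined probability falls below it, which is the sense in which way c can be sensitive when all dose levels act in the same direction.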

THE  CASE  WHERE  THE  EFFECTS  OF  THE  IV  ARE  REVERSIBLE. 
So  far,  we  have  dealt  with  cases  in  which  the  Ss  are  irreversibly 
affected  by  the  IV,  because  this  is  the  situation  in  which  the  new 
statistical  method  makes  a  unique  contribution.  Nevertheless,  an 
extension  in  which  a  subject  is  used  for  control  data  after  it  has  been 
subjected to the IV may be of interest to some experimenters,
particularly psychopharmacologists.

³ It is cautioned that combination of the probabilities obtained for each
dose level is not valid, strictly speaking, because the same control
scores are used for each dose level.

Suppose there are n subjects. On each of k occasions, one S is
randomly selected for the experimental treatment and the remaining
Ss are used as controls. For the foregoing material to be rigorous, it
is necessary that the selection be entirely at random, even if it results
in the same S being administered the experimental treatment on each of
the k occasions. The probability generating function for the sum of the
ranks obtained by the experimental Ss is

$$\left(\frac{1}{n}\sum_{i=1}^{n} s^{i}\right)^{k}$$

Irreversible effects will not affect the statistical validity of any rejection
of the null hypothesis, although the sensitivity of the test will be reduced,
so that it is only necessary that the effects of the IV be reversible
enough so that a significant result is conceivable and will make scientific
sense.

Now consider a concrete example. There are 4 Ss, each trained to
a high performance criterion. On each of 8 occasions, one of these
Ss is randomly selected for drug administration and the remaining 3 Ss
act as controls. The probability generating function looks like this:

$$\left(\frac{1}{4}\sum_{i=1}^{4} s^{i}\right)^{8}$$

The denominator of the above function, 4^8 = 65,536, is the number of
possible outcomes. I hope the reader shares my intuition that this huge
number is indicative of remarkable sensitivity to small effects.
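For the concrete case of 4 Ss and 8 occasions, the chance distribution of the rank sum can be generated directly by repeated multiplication. The sketch below is illustrative only and is not the closed-form solution given in Feller; the function name is hypothetical, and small rank sums are again assumed to be the extreme outcomes.

```python
from fractions import Fraction

def rank_sum_counts(n, k):
    """Counts for the sum of k independent ranks, each uniform on 1..n.

    Returns a list in which entry t is the number of outcomes with sum t.
    """
    counts = [0] * (n * k + 1)
    counts[0] = 1
    for _ in range(k):
        new = [0] * (n * k + 1)
        for total, ways in enumerate(counts):
            if ways:
                for rank in range(1, n + 1):
                    new[total + rank] += ways
        counts = new
    return counts

n_subjects, n_occasions = 4, 8
counts = rank_sum_counts(n_subjects, n_occasions)
n_outcomes = sum(counts)   # should equal 4 ** 8

# Cumulative chance probabilities for the smallest rank sums.
cumulative = 0
for r in range(n_occasions, n_occasions + 5):
    cumulative += counts[r]
    print(r, counts[r], float(Fraction(cumulative, n_outcomes)))
```

Changing `n_subjects` and `n_occasions` gives the corresponding distribution for other designs of the same kind.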

Because of this large number of outcomes, the probability generating
function discussed in the preceding two paragraphs cannot usually
be computed except by an electronic computer. Fortunately, both
editions of Feller's (1950, 1957) textbook on probability theory include
equations for the chance probability of any sum of ranks under this
procedure. For the 1950 edition: examples 11 and 12 on page 236 with
necessary background on pages 40-41. For the 1957 edition: examples
18 and 19 on page 266 with necessary background on pages 48-49.

As already mentioned, if the use of the statistic is to be mathematically
rigorous, the experimental S to be used in each sub-experiment must
be selected entirely at random so that some Ss may receive the
experimental treatment more often than others. From an experimental
viewpoint, however, it usually seems more desirable to administer the
experimental treatment in a restricted random sequence in which no S
receives the treatment a second time until all Ss have received it once.
My preference is for use of restricted randomization and I expect, without
solid proof, that its effects are to reduce the probability of a significant
result due to chance. If the experimenter prefers statistical rigor and
still wishes to use restricted randomization, he may use the R_n procedure.
Of course, in this case, discarded Ss will simply be ignored for
statistical purposes and may remain in training. After all Ss have received
the experimental treatment, the group can be reinstated and another
R_n procedure be administered. Cronholm and Revusky (1965) describe
how a joint generating function can be obtained for a number of R_n
experiments.

There are other usable statistical methods for reversible effects and
I am not sure the present method is better. It has been mentioned with
reference to the effects of drugs on behavior because it permits a great
deal of sensitivity with a low frequency of drug injection. Furthermore,
computation of the statistic is almost instantaneous. If it happens to be
useful, it can be elaborated much as procedures for irreversible effects
have been elaborated in this paper. For instance, in the case we used
as an example, 4 sub-experiments can be administered at one dose level
and 4 sub-experiments at a second level.


CONTROL OF DATA-SUPPORT QUALITY


Fred S. Hanson

Plans and Operations Directorate
White Sands Missile Range, New Mexico

ABSTRACT. The need for businesslike management of range-user
support is a requirement for quality control. Required, or committed,
levels  of  quality  and  reliability  largely  determine  cost  of  support  and 
value  of  the  services.  Measurement  support  is  the  best  area  to  start  a 
Range  quality-control  program.  Evaluation  support  is  an  easier  place 
to  start  formal  control  than  real-time  support.  In  this  frame  of  reference, 
quality  is  the  technical  level  --  accuracy  and/or  precision  --  of  data 
support.  The  problem  of  specifying  data  quality  has  been  largely  resolved. 
The  statistical  control  chart  for  the  standard  deviation  can  be  directly 
carried  over  to  the  flight-measurement  operation.  The  Ranges  have 
available  a  sufficient  basis  for  operating  control  --  and  for  some  of  the 
user's  needs  --  in  the  precision  of  observations  and  of  data.  It  appears 
that  quality  assurance  for  everything  is  not  necessary,  so  far  as  data- 
support  contractors  are  concerned.  A  single  number  (average  precision) 
can  serve  as  an  index  of  technical  level  of  support  performance  --  for 
control  of  resources  and  for  long-term  planning.  An  approach  to  technical 
validation  of  measurement  requirements  is  proposed. 

INTRODUCTION. By definition, some technical criteria are
necessary to efficient management of technical operations. In the case of a
missile range, the keys to some of these criteria lie in the discipline of
data analysis -- which is the hardest place for Management to get them
out.

BACKGROUND. Almost two years ago, White Sands' Range
Operations Directorate appointed a Quality Assurance Committee -- because
the formal organisation had failed to develop adequate quality control.
(The writer serves as Chairman.) The Committee engaged a consultant,
thru ARO(D) -- Charles Bicking, who once worked with General L. E.
Simon.

Figure 1 shows a (missile) range as a system. The input is from
the range user. Support may be represented as a transfer function.
The idealized diagram shows open two-way communication, within the
support function. The output is to the user.

A need of this system is businesslike management of user support --
to make the output match the input -- and to minimize the cost of the
transfer function (management takes a goodly share of
the overhead). This paper shows the extent to which this need is a
requirement for quality control.

DEFINITIONS. Quality is how well and how good. Broadly, quality
is any desirable characteristic of process or product - other than sheer
quantity or rate.

The viewpoint that a missile range need be concerned only with
production; that exactly what it turns out is less important; and that how good
this is is scarcely worth mentioning is, of course, not rational. However,
pressures to meet deadlines - and limitations of resources of all kinds -
tend to reduce a Range to this viewpoint.

Reliability  is  how  often  --  either  within  a  test  or  among  tests.  It  is 
%  success. 

Support  reliability  is  -  strictly  speaking  -  a  production  characteristic. 
However,  hardware  reliability  is  sustained  quality  -  of  the  hardware.  So, 
as  a  discipline,  reliability  is  found  with  quality.  This  paper  considers 
reliability  control  common  to  production  control  and  quality  control,  for 
missile  ranges. 

Required, or committed, levels of quality and reliability largely
determine cost of support and value of the services. So, can a Range have an
economical, consistently-valid support operation without (some form of)
quality control?

This paper considers the distinction between quality assurance and
quality control to be a matter of degree. Assurance is broader -- more
stafflike.

This paper defines statistical quality control - industrial quality
control - as: closed-loop control of operations. Its emphasis is on
formulative and evaluative control actions. As a separate discipline, or function,
quality control is taken (in Figure 2) to comprise: specification,
scorekeeping, feedback, and followup. Let's explore quality control, itself,
and each of these phases (in relation to a range).

QUALITY CONTROL. The (missile) Ranges tend to overemphasize
measurement -- somewhat at the expense of other support. I think this
(emphasis) is partly due to the cost. But, it's mostly that data support
is a focus of confusion (and difficulty). Measurement is the best place to
start a (support) quality-control program. Because it offers a big pay-off
(thru more economic control of resources and of planning); because it is
a key to the technical level of the missile effort; and because it lends
itself directly to conventional (statistical) quality control -- as will be
shown.

Evaluation support is an easier place than real-time support to start
formal quality control. Because the data holds still - and sits around -
during postflight reduction. And there's less sound and fury connected
with it.

An  overland  missile  range  is  ideal  for  (pioneering)  statistical  quality 
control  of  flight  data  --  because  it  has  an  unlimited  number  of  possible 
locations  for  instruments. 

In  this  frame  of  reference,  quality  is  the  technical  level  of  a  range's 
(daily-operating)  data  support  --  evaluated  against  the  corresponding 
requirement.  Quality  (level)  is  the  percent  to  which  a  particular  (quality) 
requirement  is  met. 

A range may need other things as much - or more - than it needs
quality control. For instance: standard operating procedures for
instrumentation; reliability control; an integral production-control
system. The National Ranges have to work on all of these.

SPECIFICATION. As this paper sees it, specification is the cornerstone
of quality control. A spec. is a practically foolproof (and knave-proof)
description of an item or service. It is the standard that tells
what counts as a goal - in the particular game. It has to be definite, and
quantitative.

In specifying measurement quality, one should ask the question:

"What do we mean, 'accuracy'?"

Suppose a user has furnished the characteristics of his vehicle -
and its proposed trajectories. Assume that a missile-performance
variable (to be measured) has been identified; and the desired units,
and coordinate system, have been specified. It takes about nine more
questions to "pin down" the user's "accuracy" requirement.

Figure  3  shows  the  elements  -  demonstrably  -  required  to  specify 
the  "accuracy"  of  flight  measurements: 

1.  What  part  of  the  trajectory?  (trajectory  phase) 

2.  At  what  intervals  (do  you  want  data)?  (reporting  interval) 

3.  Is  this  accuracy  or  precision?  (quality  characteristic) 

The  user  could  be  stating  the  allowable  discrete  error  (of  the  data)  with 
respect  to  his  (preferred)  coordinate  system.  Or,  he  could  be  stating 
the  allowable  inconsistency  of  the  data  with  itself. 

4. Does this number apply to the vector or to a component?
(mode of representation)

(If he says "component", there is a second question: Is the requirement
the same for each component?)

5. What % (of the data) must be within this tolerance? (probability
level -- % compliance)

6. Precision (or accuracy) of exactly what? (data phase)

i.e., what stage of the measurement-computation-analysis process is
being characterized?

7.  What  is  the  "operational"  basis  of  the  quality  characteristic? 
(quality  criterion) 

In  other  words,  what  sort  of  procedure  is  (to  be)  employed  to  obtain 
this  precision  (or  accuracy)  figure? 

8.  Over  what  interval  do  you  want  the  precision  (or  accuracy) 
to  average  out  to  the  requirement?  (lot  size) 

Question 5 was: What % (of the time) do you want the data (to be) within
the stated tolerance? If the user said "68% of the time", the present
question is: 68% of what time -- what is the minimum lot size to which
the spec. applies? (What constitutes an acceptance lot?)

9. How much variation is acceptable within a lot? (variability
within lot)

Of course, this is already reflected in the lot-average tolerance.

10. Finally, what support-reliability (level) will you accept?

Neither the users nor the Ranges are ready for quantitative specification
of data-support reliability.


People  go  around  saying  "accuracy".  Figure  3  shows  the  kinds  of 
uncertainty  implicit  in  that  word  -  when  applied  to  flight  measurement. 

It  turns  out  -  if  one  says  accuracy  without  further  qualification  -  the 
uncertainty as to what is meant can be as large as 15 or 16 times the
requirement.  This  was  shown  in  the  writer's  paper  (Ref.  1)  at  the  1963 
Army  Operations  Research  Symposium.  People  should  be  more  scientific 
than  being  away  from  what  they  are  dealing  with  by  a  factor  of  15  or  16  - 
as  a  matter  of  pride.  Also,  the  taxpayer  can't  afford  to  have  the  Ranges 
spend  his  dollars  so  vaguely. 

These elements of a spec. apply to all performance variables. Actually,
they cover any quantity -- no matter how obtained. Asking these questions
of the user was an expository device. They could, just as well, be asked
of a range - regarding its capability. In practice, White Sands has built a
sufficient basis, and a standard basis, for a measurement-quality spec.
into its user-document formats -- with the door left open for (the user to
state) a different basis.


WSMR's standard basis is:

Quality characteristic -- precision

Mode of representation -- component

Probability level -- 68%

Data phase -- a single value of the missile variable - at a given point
in time - in component form

Quality criterion -- propagation of error (from the previous data
phase)

Lot size -- the series of firings covered by the requirement (the
average precision for that)

WSMR wants to be judged on the average quality of "the whole trainload
of apples". First, it was necessary to state what constitutes an "apple"
(data phase) in this case.

Reliability -- in the White Sands edition of the National Range Documentation,
categorising a requirement as "Mandatory", "Required", or
"Desired" yields a qualitative judgment of the user's need for reliability
of (obtaining this) support.
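
The standard basis above can be carried as a small record, so that a
requirement need state only where it departs from the standard. What
follows is a sketch, not WSMR's document format: the class name, the
field names, and the example values (a 5-ft position requirement during
boost) are assumptions made for illustration.

```python
from dataclasses import dataclass, fields, MISSING


@dataclass(frozen=True)
class MeasurementQualitySpec:
    """One flight-measurement quality requirement (illustrative field names)."""
    variable: str                # missile-performance variable, e.g. "position"
    units: str                   # units of the stated tolerance
    tolerance: float             # the number the user states
    trajectory_phase: str        # question 1
    reporting_interval_s: float  # question 2
    # The remaining elements default to the standard basis listed in the text.
    quality_characteristic: str = "precision"
    mode_of_representation: str = "component"
    probability_level: float = 0.68
    data_phase: str = "single value at a given time, in component form"
    quality_criterion: str = "propagation of error from previous data phase"
    lot_size: str = "series of firings covered by the requirement"
    reliability_category: str = "Required"  # "Mandatory", "Required", "Desired"

    def nonstandard_elements(self):
        """Return the elements that depart from the standard basis."""
        return {
            f.name: getattr(self, f.name)
            for f in fields(self)
            if f.default is not MISSING and getattr(self, f.name) != f.default
        }


# Hypothetical requirement that accepts the standard basis as it stands.
standard = MeasurementQualitySpec("position", "ft", 5.0, "boost", 0.1)

# Hypothetical requirement in which the user states a different basis.
special = MeasurementQualitySpec(
    "position", "ft", 5.0, "boost", 0.1,
    quality_characteristic="accuracy", probability_level=0.95,
)
```

A requirement written this way leaves none of the ten questions
unanswered: whatever the user does not state is answered by the standard.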

Because WSMR's standard basis is (rather) concisely stated, it's
not foolproof - unless one (also) refers to the procedure for its calculation
(stated) in Final Data Reports. It would improve communication if
that were expressed in "English" - as well as matrix algebra. For
instance, WSMR calculates the precision of a single value of a position
component from the precision of observations of (physical) determinants
of  that  position.  In  three  dimensions,  and  matrix  algebra,  the  square 
root  of  the  sample  size  is  replaced  by  the  square  root  of:  over  A  • 

It would be desirable to spell out what that means in ordinary algebra -
and  ordinary  English. 

It is more in accord with a search for ultimate purity, and more convenient,
to say that accuracy and precision should be left in the qualitative
realm. But, the Ranges are in business. So, they have to go ahead
as best they are able. The writer has collected many publications on
measurement semantics. This paper's semantic criteria are:

First -- usefulness (for the particular purpose)

Second  --  simplicity,  clarity,  ordinary  logic 

Third  --  tradition,  rigor,  abstract  symbolism 

In  the  unsheltered  world,  communication  between  disciplines  is  more 
useful  than  purity  of  discipline.  If  it's  authoritarian,  it's  not  science  - 
anyhow. 

The writer's paper at the Tenth Conference (Ref. 2) may lead WSMR
to (separately) specify the quality of measurement of the time dimension
of missile-performance variables.

The economic goal is to give the range user exactly what he asks
for -- and not one iota more. If the user finds he needs more, he has
only to ask.

SCOREKEEPING. Standard statistical quality control can be directly
applied to a data operation - as follows:

A manufacturer of "widgets" will inspect a sample of (several) widgets
taken from production. The average caliper of the sample will become
a dot on a control chart showing the level at which his operation is running.
The average variation (standard deviation), from-widget-to-widget
within the sample, may become a dot on a control chart showing the
(current) variability of his process. Control of the level of a missile-performance
variable is a (range-) user function. The Ranges can
directly carry over the control chart for variability - of their measuring
operation. The dot on a Range's chart can be a statistical average of the
variability  for  an  entire  (phase  of  a)  trajectory;  because  feedback  control 
can  only  be  from-firing-to-firing  -  of  a  given  type.  Variability  of  the 
measuring  operation  is,  of  course,  precision  of  measurement.  We  are 
talking,  here,  about  using  a  standard  measure  of  final-data  quality  as  an 
overall-performance  index  for  a  data-support  operation  --  besides  using 
it  as  a  consistent  basis  for  user-range  communication  of  (data-support) 
requirements. 
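
As a sketch of the dot described above, the precisions of the data
points of one firing can be combined into a single rms average, which is
then compared with the requirement. The function names and the sample
figures are hypothetical; the rms form of the average is taken from the
description of Figures 4 and 5 later in this paper.

```python
import math


def rms_average(precisions):
    """Root-mean-square average of a list of standard deviations."""
    if not precisions:
        raise ValueError("need at least one precision value")
    return math.sqrt(sum(p * p for p in precisions) / len(precisions))


def firing_dots(firings):
    """One chart dot per firing: rms-average precision over its data points."""
    return [rms_average(points) for points in firings]


def exceeds(dots, requirement):
    """Flag firings whose average precision is worse (larger) than required."""
    return [dot > requirement for dot in dots]


# Hypothetical component precisions (feet) for three firings of one project.
firings = [[3.0, 4.0], [2.0, 2.0, 2.0], [6.0, 8.0]]
dots = firing_dots(firings)
flags = exceeds(dots, requirement=5.0)
```

Because each dot summarizes a whole firing, any corrective action taken
on the strength of it can only apply to the next firing of that type.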

Physical accuracy is important. But, the least we can be is consistent.
The Ranges have available a sufficient basis for operating control
- and for some of the user's needs - in the (internal) precision of their
insufficiently-calibrated data-support systems. (Insufficiently-calibrated
as systems.) White Sands' standard precision of position measures
consistency between (observing) stations -- which contains a portion of
physical truth.

A  few  samples  of  WSMR  scorekeeping: 

1. Our consultant, Mr. Bicking, developed a control chart for
instrument and system support reliability. The number of unusable
records, of a system, is plotted directly from Data Reduction's Field
Record Quality Report - without (the necessity of) calculating (the) fraction
defective. The horizontal scale is total number of instrument
operations (in a week). This avoids fluctuating limits. So, the chart
can be prepared in advance -- with (2-sigma) limits which increase
smoothly with number operated.
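
A chart of this kind can be sketched as follows. The paper does not give
Mr. Bicking's formulas; the sketch assumes the usual binomial model, in
which the expected count of unusable records is n times a long-run
fraction defective, and the limits lie 2 sigma either side of it. The
function names and the 10 percent fraction are hypothetical.

```python
import math


def count_limits(n_operations, p_bar, k=2.0):
    """Lower limit, center line, and upper limit for the unusable-record count.

    Assumes a binomial model: n_operations independent instrument operations,
    each unusable with long-run probability p_bar.
    """
    center = n_operations * p_bar
    half_width = k * math.sqrt(n_operations * p_bar * (1.0 - p_bar))
    return max(0.0, center - half_width), center, center + half_width


def out_of_limits(unusable, n_operations, p_bar, k=2.0):
    """True when a week's count of unusable records falls outside the limits."""
    lower, _, upper = count_limits(n_operations, p_bar, k)
    return unusable < lower or unusable > upper


# Limits drawn in advance over a range of weekly operation counts.
chart = {n: count_limits(n, p_bar=0.10) for n in range(50, 301, 50)}
```

Plotting the count against the number operated, instead of plotting the
fraction defective, is what lets the limits be drawn once as smooth curves.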


2. Figure 4 is another control chart on an intermediate
"product". This is from Data Reduction's monthly Data Quality Report
(Ref. 3). It shows the (rms-) average precision of azimuth-angle observations
by each (Askania-) cinetheodolite station - identified by number.
(Ordinate scale is minutes of arc.) The precisions for August are the
plain bars. The shaded bars are cumulative-average precisions. The
UCL is a 3-sigma control limit -- based on the fluctuations of the
cumulative-average precisions (cumulative from 1 January) about their
central value, during the March-April period. (It should be realized
this is 3-sigma of sigma.)
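
The two quantities on Figure 4 might be computed along the following
lines. This is a sketch under stated assumptions: the cumulative average
is taken as an rms over equally weighted months, the central value as
the arithmetic mean of the base-period values, and the fluctuation as
their population standard deviation. The paper does not say which forms
Data Reduction used, and the function names are hypothetical.

```python
import math
import statistics


def cumulative_rms(monthly_precisions):
    """Running rms average of monthly precisions, each month weighted equally."""
    running, total = [], 0.0
    for count, value in enumerate(monthly_precisions, start=1):
        total += value * value
        running.append(math.sqrt(total / count))
    return running


def upper_control_limit(base_period_values, k=3.0):
    """Central value plus k standard deviations of the base-period values."""
    center = statistics.mean(base_period_values)
    spread = statistics.pstdev(base_period_values)
    return center + k * spread


# Hypothetical monthly precisions (minutes of arc) for one station.
monthly = [0.30, 0.34, 0.31, 0.29, 0.33]
cumulative = cumulative_rms(monthly)
ucl = upper_control_limit(cumulative[2:4])  # base period: third and fourth months
```

Since the plotted variable is itself a standard deviation, the limit
measures how much that standard deviation wanders, which is the sense of
"3-sigma of sigma".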

3. Let's look at (an example of) overall data-support quality.
Figure 5 is from Data Reduction's Data-Quality Report for May (Ref. 4).
It shows the (rms-) average precision of (cinetheodolite) position measurement,
in feet, for (the) Little Joe (component of NASA's Apollo). The
solid curve shows the average (data-point component) precision for each
round. The dotted curve is the cumulative-average precision. For this
Project, the requirement and the range commitment happen to be the
same. In the beginning, Data Reduction didn't use statistical control
limits; because WSMR's greatest need was to see where it stood -- and
what sort of creature it was. There were better - and worse - charts
than Figure 5. The main point is: the quality of WSMR data support
can vary widely from-test-to-test. (Also, from-month-to-month - and
from-project-to-project.)

The average precision of final data is a measure of support performance.
When the user's requirement is valid, average precision is
(also) a measure of Range effectiveness.

Two of WSMR's operating chiefs were displeased by the test-to-test
precision charts. The bad data was too evident. Since the May
Report, precisions for each test have been shown only in tables. Starting
with the current Data Quality Report (Ref. 5), monthly and
cumulative-average precisions for each project - along with the
requirement and commitment - are shown on bar graphs.

One of WSMR's operating supervisors suggested (seriously) final-data
charts could be improved by editing the input to the average
precisions - at (about) 75% confidence. That's editing at 1.15 times
sigma (the variable being plotted). It would do a great deal for the
charts, but it would nullify their usefulness.
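
The arithmetic behind this remark can be checked with a short sketch. It
assumes normally distributed errors and two-sided editing, neither of
which the paper states; the function names are hypothetical. Under those
assumptions, limits at 1.15 sigma keep about 75 percent of the values,
and the rms of what is kept comes out at roughly 60 percent of the true
sigma.

```python
import math
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def coverage(c):
    """Fraction of normal errors that fall within +/- c standard deviations."""
    return 2.0 * STANDARD_NORMAL.cdf(c) - 1.0


def edited_rms_ratio(c):
    """Ratio of the rms of the retained values to the true sigma.

    Uses the variance of a standard normal truncated symmetrically at +/- c:
    1 - 2 c pdf(c) / coverage(c).
    """
    variance = 1.0 - 2.0 * c * STANDARD_NORMAL.pdf(c) / coverage(c)
    return math.sqrt(variance)


kept = coverage(1.15)           # fraction of the input surviving the edit
ratio = edited_rms_ratio(1.15)  # plotted sigma as a fraction of actual sigma
```

The edited chart would therefore report a precision that the operation
never delivered.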

A comment on (the problem of) monitoring a support contractor:

MIL-Q-9858, etc. furnish guidelines for quality assurance - almost
"from-womb-to-tomb". These are procurement-oriented regulations. When
procurement requirements "cannot" be numerically specified; compliance
"cannot" be demonstrated by test; or initial failure to meet "cannot" be
tolerated, it is necessary to inspect a contractor for "everything". This
paper has shown a basis for definite, numerical specification of flight
measurement - and demonstration of compliance. The fact that, in the
past, the Ranges have not had systems for reporting whether requirements
were met (in a technical sense) is evidence that initial failure to meet
can be tolerated. So, it appears that quality assurance for "everything"
is not necessary - so far as data-support contractors are concerned. A
single number - average precision of measurement - can serve as an
index of variability of (a given data-support) process and product; and as
an index of technical level of support performance (and effectiveness) --
for control of resources and for (long-term) planning.

FEEDBACK. Open-loop control gives good results only for very
simple,  easily-controlled  processes.  Flight  measurement  is  not  simple 
or  easily-controlled. 

In  the  past  year,  WSMR  has  increased  feedback  on  field-record 
assessments;  and  on  which  stations  are  thrown  out  in  data  reduction. 

WSMR has also initiated feedback on average angular errors of each
(optical) station (Fig. 4); and feedback on final-data quality (Fig. 5).
Purposes of these feedbacks include: input to and "calibration" of
station-selection (computer) programs (Ref. 6); and input to (the actual)
instrumentation plans. Which feedbacks have the greatest "profit
potential" - and what the optimum and achievable time frames are -
remains to be determined.

Some range personnel, who are not quality-minded, make a counter-issue
of "timeliness" (of data delivery). So, the (Plans & Operations
Directorate's) Quality Assurance Committee has adopted that word. The
Committee is stressing timeliness of feedback -- timeliness of quality
reporting, as well as of data reporting.

It should be realized that final-data quality reports are (also) a
formal system for knowing Range capabilities -- the beginning of such
a system. Copies of final-data quality reports now go to White Sands'
(long-term) planners. These reports also serve as feedup to top management.
They put numbers on some of WSMR's technical, operating, and
management problems.

FOLLOWUP. Assurance of followup, in a procedural sense, is a
Quality Control responsibility. Actual followup is an operating responsibility.

When Data Reduction QC gets a very bad average precision of final
data, they check to be sure the analyst is not including the poorest part
of that trajectory in the formal report.

When  an  optical  station  has  the  same  (major)  deficiency  in  its  field 
record  for  two  consecutive  tests,  Data  Reduction  assessment  personnel 
report  this,  by  telephone,  to  Optical  Division  technical  personnel.  When 
this  proves  insufficient,  Data  Reduction  will  send  a  (written)  memo  to 
Data  Collection  -  requesting  a  reply  stating  what  corrective  action  has 
been  taken,  and  what  further  action  is  planned. 

The writer suggested Data Reduction look at average relative bias
of each (cinetheodolite) station -- by taking algebraic means of angular
residuals (from least-squares solutions). A good deal of what WSMR
treats as random error is persistent bias. It turns out this step will
not be practical until WSMR has quality control per segment of trajectory
- so average relative bias (of a station) will be with respect to a single
group of stations. The fact the reference changes with each station
added (or subtracted) shows the extent of the station-bias problem.
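
The suggested step can be sketched as follows: the algebraic mean of a
station's residuals estimates its relative bias, and what remains of the
mean square after the bias is removed is the scatter about that bias.
The function name and the sample residuals are hypothetical, and the
sketch presumes the residuals all refer to one fixed group of stations,
which is the condition the paragraph above says is not yet met.

```python
import math


def bias_and_scatter(residuals):
    """Split a station's angular residuals into relative bias and scatter.

    Returns (bias, rms, scatter): the algebraic mean, the root mean square,
    and the standard deviation about the mean, so rms**2 = bias**2 + scatter**2.
    """
    if not residuals:
        raise ValueError("need at least one residual")
    n = len(residuals)
    bias = sum(residuals) / n
    mean_square = sum(r * r for r in residuals) / n
    scatter = math.sqrt(max(mean_square - bias * bias, 0.0))
    return bias, math.sqrt(mean_square), scatter


# Hypothetical azimuth residuals (minutes of arc) for one station.
bias, rms, scatter = bias_and_scatter([0.4, 0.5, 0.3, 0.6, 0.2])
```

When the bias is a large share of the rms, treating the whole rms as
random error overstates what averaging over more observations can do.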

Of  course,  White  Sands  is  setting  its  sights  on  controlling  (both) 
the  level  and  the  test-to-test  consistency  of  average  precision  (of  data). 
To  do  this,  it  needs  to  learn  how  to  break  down  the  firing-to-firing 
variability  (for  a  given  project)  into  that  due  to:  project;  weather; 
collection;  reduction;  other.  Major  factors  are:  number  of  stations; 
where the missile flies; reliability of stations; "visibility"; relative
locations  of  stations;  quality  of  stations. 

WSMR won't have real control until it moves its feedback out of
a management time frame into an operating time frame. It is also
necessary to increase supervisory awareness of the quality feedback
(and feedup) now available.

White Sands is slowly moving toward: reliability and precision standards
for each type of instrument and system; clearly defined responsibilities (of
Collection and Reduction) in relation to final-data quality; functional management
of this shared responsibility.

Mr.  Bicking  essayed  an  analysis  of  variance  of  undesigned,  operational 
precision  data.  He  was  able  to  test  whether:  stations,  film-reading 
machines,  (human)  readers,  etc.  had  significant  effects  on  data  precision. 

He  was  not  able  to  determine  the  amount  by  which  each  affected  precision. 
Further  investigation  (of  this  approach)  is  certain  to  be  fruitful. 
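
A test of this kind, for a single factor, can be sketched as a one-way
analysis of variance. The paper does not describe Mr. Bicking's actual
layout, so the grouping by station, the function name, and the sample
figures are assumptions. The F ratio is to be compared with a tabulated
critical value for the stated degrees of freedom.

```python
def one_way_f(groups):
    """F ratio and degrees of freedom for a one-way analysis of variance.

    groups: list of lists, one list of precision values per factor level
    (for example, one list per station).
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more values than groups")
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0.0:
        raise ValueError("no within-group variation; F ratio is undefined")
    df_between, df_within = k - 1, n - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical precisions (minutes of arc) grouped by station.
by_station = [[0.30, 0.32, 0.31], [0.40, 0.42, 0.38], [0.29, 0.33, 0.31]]
f_ratio, df_between, df_within = one_way_f(by_station)
```

The ratio says whether the factor matters; it does not by itself give
the amount by which each level shifts the precision, which matches the
limitation reported above.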

Range support is a hard place to carry out classical design of experiment.
The missile Project does the test design. The Range has, occasionally,
put two (similar) instruments at the same site. WSMR runs a
range-calibration "test" (Ref. 7), at infrequent intervals. Instrumentation
(support) planning is a statistical-design problem. However, current
station-selection computer programs (Ref. 6) are a long step away from
representing analyses of variance. P&O's Quality Assurance Committee
aims to develop the Range's quality-control situation to the point where
it will use Evolutionary Operation (EVOP) (Ref. 8). Presently, the Range
needs to carry out (more) correlations - and analyses of variance - on
undesigned, operational data. WSMR urgently needs a statistical-calculating
service. It also needs better coordination of its applied-statistics
efforts.

REAL  TIME.  Quality  control  applies  -  in  its  entirety  -  to  real-time 
data  support.  Specification  is  exactly  the  same;  but  WSMR  hasn't  built 
this  into  its  edition  of  the  National  Range  Documentation  (to  the  same 
extent) - yet. Real-time scorekeeping and feedback can be carried out
on a firing-to-firing basis. To some degree, they can be included in
the  real-time  computer  program.  Followup  is  the  same  (problem)  as  for 
evaluation  support. 

STATUS  OF  QUALITY  CONTROL.  The  suggestion  to  use  (a 
standardised)  average  precision  as  an  index  of  overall  data-support 
performance  was  made  five  years  ago,  by  this  writer  (Ref.  9).  It  took 
four  years,  one  Committee,  and  John  Carrillo  (of  Data  Reduction)  to 
implement  this. 

In applying statistical quality control to data support, White Sands
is running counter to Thurman Arnold's corollary (to Parkinson's law):
No new government activity can possibly be effectively carried out by any
established government organization.

Only  a  little  over  a  year  ago,  the  most  important  product  of  P&O's  Quality 
Assurance  Committee  was  hope  --  hope  that  data  support  could  be  put  on 
a  more  objective  basis. 

The  focus  of  Quality  Control  has  caused  White  Sands  to  correct  a 
few errors in its data-reduction methods.

P&O's Quality Assurance Committee is still selling quality control
to operators -- as a tool for their use -- not as a club held by Management.
Data Collection (people) recently asked that Data Reduction's Field
Record Quality Report be discontinued -- on the ground that Data Reduction's
assessments were not valid. Data Collection personnel have since been
told to exchange assessment sheets with Data Reduction - both ways - to
improve understanding. A Quality Assurance Subcommittee is developing
a single set of standards, and a single SOP, for assessment of optical
records. Data Reduction Groups on the Ranges are predominantly
mathematicians. To this writer, they seem inclined toward monodisciplinary
and laboratory viewpoints - and to favor a priori approaches.
Resistance by data-reduction personnel to quality control may have been
due to QC's management, and factory, and a posteriori connections. But,
quality control is not the factory in science. It's science in the factory.
(This is now recognized by White Sands data-reduction personnel.)

Data Collection Quality Control has been mainly concerned with
solving the problems of station reliability.

There is still a need to sell quality control (of range support) to
various echelons of Range Management -- as a tool for their use -- not as
a constraint. The key to selling Management probably lies in the fact
that - for data support - quality control is resource control. The prospect
of better bridging the (communication) gulf between Management and
data-analysis (personnel) may cause some discomfort on both sides. Of
course, at some date, White Sands will have to bring cost into its
Quality Control picture. Specifically: precision/manhour, precision/dollar,
and value of precision (as distinguished from cost).

PSYCHOLOGICAL IMPACT OF QUALITY CONTROL. It is this
writer's observation that truth for the sake of the mission is psychologically
closer to truth for its own sake than it is to truth as an instrument of
power.

Negative reactions to quality control appear to be due to resistance
to change - and to dislike of the "criticism" inherent in scorekeeping
(feeling threatened by any demand to be objective). Quality control is
partly an educational - and re-educational - problem. AMETA gave a
composite of its basic and advanced Statistical Quality Control courses
at White Sands. This writer is working on an executive primer of
flight-measurement quality (and specification). WSMR may bring in a
quality-control speaker. The Chief of Data Collection Quality Control
has written a memo to the individual (field) operators, and their supervisors,
asking them to identify - verbally or in writing - existing or
potential causes of error; and to grade these as critical, major, or
minor. (This will also be an input to the work of the optical-assessment
Subcommittee.)

On the positive side, keeping score adds meaning and significance
to any game. Keeping score makes how-to-play (how-to-operate) more
important - not less. It improves the motivation and morale of
functionally-oriented people.

PHYSICAL ACCURACY. This paper defines accuracy as: the
numerical difference between any value and the "true" value. It is
further necessary to say that the "true" value must be a reference
physically independent of the value characterised. Physically independent
means: the errors of measurement (of the two) are uncorrelated. (Of
course, accuracy is the inverse of the "absolute" error defined here as
its measure.)

The development of the potential of its star-reference BC-4 camera
system is White Sands' only real hope for an absolute-accuracy reference
for flight data.

WSMR  could  derive  accuracy  (as  well  as  precision)  estimates  for 
two  kinds  of  data  in  its  current  operation.  Besides  star-referenced 
ballistic-camera  data,  it  could  do  this  for  (launch  and  terminal)  fixed- 
camera  data  -  in  which  the  reference-target  poles  are  photographed  in 
the same frame as the missile.

Of course, measurements cannot be consistently accurate unless
they are also precise. Differences in the accuracies of stations affect
system precision.

This  writer  holds  that  being  more  definite  and  quantitative  about 
precision  will  increase  (range  and  user)  understanding  of  accuracy  - 
and  awareness  of  specific  needs  for  it. 

VALIDATION OF REQUIREMENTS. Range-support personnel are
often asked "Why don't you tell them they don't need all this data?"

Of course, user requirements should be based on missile technology
and missile-test design. Support personnel have no particular qualifications
in those fields.

This writer has suggested an approach to deriving measurement
requirements in which the Range can assist the user (Ref. 10).

The simplest way to derive estimates of required data quality is to
convert missile-performance tolerances to measurement "tolerances"
-- directly, when they are the same variable -- or by (complete)
propagation of error (formulas) thru an equation relating the performance
variable and the measured variable. As Figure 6 shows, the resulting
"tolerance" must then be tightened -- on the basis that the actual
uncertainty whether missile performance meets its specified tolerance
is the sum of the uncertainty of the measured performance and the
allowable uncertainty of the specified performance. The required
measurement tolerance depends on the level of risk at which the missile-using
agency is, practically, willing to operate. While a Range
superiority of 10 times (in standard deviation - sacrificing elegance
for clarity) would be ideal -- 2 to 2½ times is the necessary level; 5
times is certainly the sufficient level.
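
The conversion and the tightening can be sketched in a few lines. The
sketch assumes a single measured variable x related to the performance
variable by y = f(x), uses a first-order (slope) propagation of error in
place of the complete formulas, and adds the two uncertainties linearly
as the paragraph above does. The function names and the sample relation
are hypothetical.

```python
def measurement_tolerance(f, x, performance_tolerance, h=1e-6):
    """Convert a tolerance on y = f(x) into a tolerance on the measured x.

    First-order propagation of error: sigma_y = |dy/dx| * sigma_x, with the
    slope estimated by a central difference at the operating point x.
    """
    slope = (f(x + h) - f(x - h)) / (2.0 * h)
    if slope == 0.0:
        raise ValueError("performance variable is insensitive to x here")
    return performance_tolerance / abs(slope)


def tightened_tolerance(tolerance, superiority):
    """Measurement sigma the Range must meet for a given superiority ratio."""
    return tolerance / superiority


def total_uncertainty(specified_sigma, measured_sigma):
    """Linear sum of the two uncertainties, as described in the text."""
    return specified_sigma + measured_sigma


# Hypothetical relation: the performance variable is 4 times the measured one.
raw = measurement_tolerance(lambda x: 4.0 * x, x=2.0, performance_tolerance=8.0)

# Required measurement sigma at the necessary, sufficient, and ideal levels.
levels = {ratio: tightened_tolerance(raw, ratio) for ratio in (2.0, 2.5, 5.0, 10.0)}
```

At a superiority of 5 the measurement adds one fifth to the specified
uncertainty; at 2 it adds one half, which shows why the lower ratios
carry more risk for the missile-using agency.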

DESIGNS AND ANALYSES FOR INVERSE
RESPONSE PROBLEMS IN SENSITIVITY TESTING*

M.  J.  Alexander  and  D.  Rothman 
Rocketdyne,  A  Division  of  North  American  Aviation,  Inc. 

Canoga  Park,  California 

INTRODUCTION. Sensitivity testing is that area of experimentation
in which each test is characterized by a quantal response. To some
sample specimen or realization of a system one or more stimuli are
applied and the result is either a "response" or a "nonresponse", depending
on whether some critical physical threshold was or was not exceeded
for that particular sample. The most commonly encountered type of
sensitivity problem is that of finding at what level of the stimulus variable
a given percent response will occur. For example, in biological
assay it is often necessary to determine the dose (called LD 50 or ED 50)
which is effective half the time, and in testing explosives, it is often of
interest to find the stress that results in a detonation, say, 95% of the
time. In each of these situations we are concerned with inverting the
relationship which gives the probability of a response as a function of the
stimulus; thus the terminology (probably due to J. W. Tukey) of the
"Inverse Response Problem."

The general problem can be stated more precisely as follows:

Suppose we have a stress variable x, and suppose that a test at this
stress can result in a response which is either "1" or "0". This is the
well known quantal response experiment. Let M(x) denote the mean or
average response fraction at x. In this situation M(x) is called the
response function. If M(x) is monotone nondecreasing, it may be thought
of as representing a cumulative distribution function, as, for example,
the cumulative normal distribution

$$\int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(y-\mu)^2/2\sigma^2}\, dy.$$


In  most  cases,  however,  the  explicit  form  of  M(x)  is  not  known. 


*This work was supported by the George C. Marshall Space Flight
Center, NASA, Huntsville, Alabama, under Contract No. NAS 8-11061.

For the inverse response problem the experimental objective is the
estimation of that x = x_α (generally unique, but not necessarily so) at
which M(x) = α, for a given value of α. We shall be concerned here
both with experimental designs for the inverse response problem and
methods for analysing the test results.

The first approach to this problem was based on the use of the probit
design [1], which was originally formulated for biological applications.
This design requires a fixed number of tests at each of a given set of
stimulus levels, and thus a large number of tests is necessary. The
analysis generally used with the probit design is based on assumptions
concerning the response function M(x), and the objective of the analysis
is the determination of response function parameters. Once these have
been estimated, the solution of the inverse response problem can be
obtained for any α.

When cost or availability of materials is an important consideration,
the probit design, because of the large number of tests involved,
becomes impractical. To obtain estimates of x_α, particularly for
α = 0.5, in fewer tests, a sequential design was introduced in 1943 at
the Explosives Research Laboratory at Bruceton, Pa. [2]. The rules
for the Bruceton or up-and-down design require increasing the stimulus
by a fixed step-size after a nonresponse and decreasing the stimulus
one step after a response. The up-and-down design is still the most
widely known and most extensively used test procedure, particularly for
explosive testing and other engineering applications. For α = 0.5 it is
used in conjunction with the Dixon-Mood [3] or, more recently, the
Dixon [4] analysis, both of which assume that the response function M(x)
is cumulative normal.
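The up-and-down rule can be sketched in a few lines. This is only an illustration of the stepping rule stated above; `respond` is a hypothetical stand-in for a physical test, and the threshold used in the demonstration is an arbitrary assumption.

```python
def up_and_down(respond, start, step, n_tests):
    """Run a Bruceton (up-and-down) sequence.

    respond(x) returns 1 for a response and 0 for a nonresponse
    (a hypothetical test oracle). Returns (level, result) pairs in test order.
    """
    history = []
    level = start
    for _ in range(n_tests):
        result = respond(level)
        history.append((level, result))
        # Step down after a response, up after a nonresponse.
        level = level - step if result == 1 else level + step
    return history


# Deterministic stand-in: every specimen responds at or above 0.25 (illustration only).
demo = up_and_down(lambda x: 1 if x >= 0.25 else 0, start=1.0, step=0.5, n_tests=6)
```

In the demonstration the levels settle into an oscillation around the threshold, which is the behavior that makes the design suitable for α = 0.5.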

Other methods generally used with distributional assumptions are
the Langlie [10] and rundown designs. When these procedures are used
to estimate x_α for values of α near .5, inappropriate distributional
assumptions do not have a critical effect on the efficiency of the design.
However, for more extreme values of α the situation is more critical,
not only because the tails of the response distribution are more sensi-
tive to inappropriate assumptions, but also because the estimates of x_α
in these cases are generally less robust.

In the absence of distributional assumptions on M(x) the inverse
response problem was first attacked directly by Robbins and Monro [5],
who employed a stochastic approximation design. In this procedure the
step-size is no longer fixed, and the rules for determining successive
test levels depend only on the last test (as in the up-and-down design).
The levels converge to the desired critical level, x_α, not only in
probability but with probability one. This design and its variations,
Kesten [6], Odell [7], and delayed [8], particularly the latter, are
slightly more efficient than the Bruceton or probit designs for α = 0.5.
However, for more extreme values of α (e.g., α = 0.05, 0.95), simula-
tions [9] indicate that the design seems to be much less efficient than
expected.
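For comparison with the fixed-step rule, the classical Robbins-Monro recursion moves the next level by (c/n)(y_n - α), so the step shrinks with the test number n. The sketch below assumes this standard form; the gain constant `c` and the oracle `respond` are hypothetical names, and the variations cited above are not implemented.

```python
def robbins_monro(respond, start, c, alpha, n_tests):
    """Classical Robbins-Monro sequence for the level where M(x) = alpha.

    respond(x) returns 1 (response) or 0 (nonresponse); c is an assumed gain.
    Returns the n_tests test levels followed by the final estimate.
    """
    levels = [start]
    x = start
    for n in range(1, n_tests + 1):
        y = respond(x)
        # The step depends only on the last result and shrinks like 1/n.
        x = x - (c / n) * (y - alpha)
        levels.append(x)
    return levels


# Deterministic stand-in with threshold 0 (illustration only).
rm_levels = robbins_monro(lambda x: 1 if x >= 0.0 else 0,
                          start=1.0, c=1.0, alpha=0.5, n_tests=5)
```

Because each step uses only the most recent result, the procedure discards the information in earlier tests, which is the limitation the two new designs address.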

For many years the major attention in the inverse response problem
was focused on the case α = 0.5. For this problem both the up-and-down
and Robbins-Monro types of designs give reasonable answers in about
6-12 tests. However, reliability and safety problems require estimates
of x_α for α ≤ .05 or α ≥ .95. For such extreme values of α, when
prior knowledge of the response function M(x) is limited, it was
necessary to consider new design and analysis procedures. In the
Robbins-Monro design, successive test levels are determined from only
the previous test results. One would expect that improved estimates
of x_α could be obtained if all data were analyzed before the next test
level was selected.

The two designs described in this paper were formulated from this
point of view; they give good results with limited sample sizes for
α ≈ .05 (.95) and are still useful in many applications for α ≈ .02 (.98).
One design is appropriate when it is desired to continue testing on a set
of discrete test levels until a specified precision in the estimate of x_α
is attained. The other is appropriate when the sample size is fixed in
advance and there are no restrictions on test levels. Both designs have
been evaluated by simulation and it is shown that they compare favorably
with existing procedures and with a conjectured asymptotic criterion for
distribution-free inverse response problems.

ALEXANDER DESIGN

GENERAL DESCRIPTION. In sequential designs new levels for
testing are determined from previous test results, and this may be
accomplished in many different ways. In the Alexander design the step
size is constant but (unlike the Bruceton and Robbins-Monro procedures)
new test levels depend on all previous test results. It is assumed only
that the response function M(x) is monotone nondecreasing, so that the
design is otherwise distribution-free. It uses alternately increasing and
decreasing sequences to bound the sought-for stimulus level x_α. Test-
ing ends when x_α is, with a specified probability, located within an
interval of length not more than 2Δ, where Δ is the step size. From
an estimate of this interval, an estimate of x_α can be found in turn by
linear interpolation.

The initiation and termination rules for the sequences are determined
in terms of monotone estimates of the response probabilities at the test
levels. In one version of the design, which should be used for α near
0.5, maximum-likelihood estimates are used. However, for extreme
values of α, it is more efficient to use both maximum-likelihood esti-
mates and certain estimates based on confidence bounds which will be
described subsequently.

Simulations of both the 0.5 and 0.05 designs have been carried out.
The design is generally quite efficient relative to other available distri-
bution-free designs, and is roughly as efficient as the best parametric
stochastic approximation when distributional assumptions on M(x) can
be made.

The general rules for the design may be described as follows:

1. The first test is at L_1, the a priori best guess of x_α.

2. By the method of reversals (Appendix I) monotone estimates
are evaluated at all test levels after each test.

3. Testing will be performed by alternately increasing and
decreasing sequences of test levels.

4. The first test of an increasing (decreasing) sequence is at
either the highest (lowest) level, strictly above (below) the last
test level, at which the estimate is less (greater) than or equal
to α or, if there is no such level, at the level above (below) the
last test level.

5. An increasing (decreasing) sequence will be terminated at the
first level at which the estimate after a test at that level is
strictly greater (less) than α.

6. The rules for ending the design depend on the value of α and
are given explicitly below.

THE DESIGN FOR α = 0.5. For α = 0.5 the estimates used in
following the design rules are the maximum-likelihood estimates given
by the method of reversals. When the testing is finished we wish to
have an interval I such that Prob{x_α ∈ I} > P, where P is some
prescribed probability. The length of I depends on the particular experi-
ment; it is never more than twice Δ, the step size, but in most cases
it is Δ. The occurrence of an interval of length 2Δ corresponds to
the situation when x_α is at the center of I. Testing will be stopped
when either of the following conditions is satisfied: (a) there are three
adjacent test levels L_0 < L_1 < L_2 such that the response estimate at
L_1 is .5 and the response estimates p_0 and p_2 at levels L_0 and L_2,
respectively, lead to the confidence statements

    Prob{p_0 > .5} < (1-P)/2

    Prob{p_2 < .5} < (1-P)/2

or (b) there are two adjacent levels for which the above
confidence statements can be made.

When P is .5 then the conditions for p_0 and p_2 are given by
the following table:

     A     B

     0     2
     1     3
     2     4
     3     6
     4     7
     5     8
     6     9
     7    10
     8    12
     9    13
    10    14

In this table, A denotes the number of responses at L_0 and B
denotes the minimum number of nonresponses which must be observed
at the level L_0 for the condition in (a) to be satisfied. Similarly, if A is
the number of nonresponses at L_2 then at least B responses at L_2
are required for termination.

THE DESIGN FOR EXTREME VALUES OF α. In a desirable
distribution-free design for the inverse response problem, most of the
test levels are concentrated in a region around x_α. Therefore, when
α ≈ .05 we would expect on the average 19 nonresponses for every
response. Thus in this case a good design forces some testing in the
stimulus region below the lowest level at which a response has been
observed. Since the maximum-likelihood estimates of the response
probabilities are all zero in such a region, a new kind of "estimate"
will be introduced to insure a sufficient number of zero responses. This
"estimate" is actually used only to determine when to terminate a
decreasing sequence. The method is most easily introduced in terms
of an example.


Suppose that after some testing the following responses have
occurred (L_1 < L_2 < L_3 < L_4): there is one nonresponse at each of
the levels L_1 and L_2, two nonresponses at L_3 and one response at L_4.
At L_1, L_2 and L_3 we would like to obtain estimates which satisfy the
monotonicity assumption on M(x) and which indicate that it is likely that
the actual response fractions at these levels are greater than zero. We
will accomplish this by introducing an appropriate confidence bound. In
the example being considered two nonresponses were observed at L_3 and
either from binomial tables or the equation

$$ (1 - \tilde{p}_3)^N = 1 - P \qquad (N = 2) \tag{1} $$

one can obtain an estimate, p̃_3, for a given probability P (specified in
advance) such that

    Prob{p_3 < p̃_3} > P .

If the same criterion is used at L_2, a larger estimate than that at L_3
should result. To insure monotonicity an interval estimate will be used.
This will be accomplished by introducing a "zero region" for each level,
defined as that level and all consecutive higher levels at which no responses
have occurred. Thus the zero region Z_2 for L_2 is the interval (L_2, L_3)
and similarly Z_1 = (L_1, L_3). The estimate for L_2 can then be found from
(1) with N = 3 (the number of zeros in Z_2).

The objective in using the new type of "estimate" is to be reasonably
sure that decreasing sequences end below x_α. From the rules of the
design, a decreasing sequence will terminate at a level L_0 where p̃_0 < α.
Because of the way p̃_0 is defined the following confidence statement can
be made:

    Prob{p_0 < α | observed responses} > P ;

i.e., on the basis of the observed responses the probability that L_0 is
below x_α is greater than P. For each decreasing sequence, the same
total number of nonresponses in the appropriate zero region will be
required for termination. Thus, it is not necessary to determine estimates
at each level. Instead, from

$$ (1 - \alpha)^n = 1 - P , \qquad N = [n] + 1 \tag{2} $$

one can determine the appropriate N for a given P and then it is only
necessary to count zeros in the zero region.
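Equation (2) is easy to evaluate directly. The sketch below solves (1 - α)^n = 1 - P for n and takes N = [n] + 1, with [n] read as the integer part of n; the function name is hypothetical.

```python
import math


def zero_count_required(alpha, P):
    """N from equation (2): solve (1 - alpha)**n = 1 - P, then N = [n] + 1."""
    n = math.log(1.0 - P) / math.log(1.0 - alpha)
    return math.floor(n) + 1


# Required number of nonresponses in the zero region for P = .5.
n_table = {a: zero_count_required(a, 0.5) for a in (0.10, 0.05, 0.02, 0.01)}
```

For P = .5 this gives 7, 14, 35 and 69 nonresponses for α = .1, .05, .02 and .01, in agreement with the values tabulated in the paper.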

A uniform set of rules for the design can now be given:

1. The first test of an increasing sequence is at the level below the
lowest level at which a response has been observed. If the result
of this test is a response the sequence ends; otherwise, one more
test (at the next higher level) is performed.

2. The first test of a decreasing sequence is at the level below the
lowest level at which a response has been observed. The sequence
ends at a level L_0 whose zero region contains at least N non-
responses. Values for N can be found from (2). The following
table gives values of N for P = .5:

     α       N

    .10      7
    .09      8
    .08      9
    .07     10
    .06     12
    .05     14
    .04     17
    .03     23
    .02     35
    .01     69

3. Testing ends when there are three adjacent levels L_0, L_1, and
L_2 such that at least one response has been observed at L_2 (and
none at a lower level), and a total of at least N nonresponses
has been observed at L_0 and L_1. The value of N is given in the
preceding table.

4. An estimate of x_α can be found by linear interpolation between
[remainder of rule illegible in source].
The Alexander designs have the virtue that once the rules are under-
stood, the actual procedure is fairly straightforward and the calculations
required between tests are extremely simple. Of course, as with any
distribution-free design, distribution assumptions can always be adopted
after testing is complete. If, for example, it is desired to find x_α under
the assumption of an underlying cumulative normal distribution, the
estimates determined from the data generated by this design are some-
what better than those based on the data obtained from an up-and-down
design, and are in fact almost as good asymptotically as the optimum for
the 1% cumulative normal inverse response problem. Furthermore, any
departure from normality will probably affect the estimates obtained
from these data much less than the (extrapolated) estimates obtained from
up-and-down data. One of the advantages of these designs is the small
number of tests required. An estimate of the expected upper bound N̄
on the number of tests is given by

    N̄ = 2N (1 + 1/2 + ... + [final term illegible in source])

where N = N_0 if N_0 is an integer and N = [N_0] + 1 otherwise.
When P_0 = .5 this gives an expected upper bound of 76 tests for α = .05.

The design has been simulated for α = .5 and .05. It appears that
this design, particularly for extreme values of α, is more efficient
than other nonparametric designs which are not based on analysis of all
previous results at each stage. In our simulations, the median number
of tests was about 64 for α = .05; for α = .5 the median number of tests
was about 16.

EXAMPLES: 

1. In the following simulated example, the Alexander design is used
for α = .5, Δ = .5, with a cumulative normal response function,
μ = 0, σ = 1:

    Test Number    Stress    Response

         1           1.3        1
         2            .8        1
         3            .3        0
         4            .8        1
         5            .3        1
         6           -.2        1
         7           -.7        0
         8           -.2        0
         9            .3        1
        10           -.2        1
        11           -.7        0
        12           -.2        1

Since the response fraction at -.2 is 3/4, while at .3 it is 2/3,
the method of reversals must be used, giving 5/7 at -.2 and at
.3. Linear interpolation between 5/7 at -.2 and 0/2 at -.7 gives
as final estimate x_.5 = -.35.

2. The following data were simulated using a normal response
function with μ = 0, σ = 1, so that for α = .05, x_α = -1.645.
The first test was at -3σ and the step size chosen was .25σ.
(The X's and O's indicate responses and nonresponses,
respectively.)

    Stimulus Level    Test Results

    -3.00, -2.75, -2.50, -2.25, -2.00, -1.75,
    -1.50, -1.25, -1.00, -.75, -.50

    [The grid of X and O marks by stimulus level and sequence is not
    recoverable from the source; it records 44 tests in all.]

In the above table, the columns indicate sequences (I for increasing, D for
decreasing). The final estimate is obtained by linear interpolation, which
yields x̂_α = -1.675. Note that a total of 44 tests was required.
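The final interpolation step in these examples is ordinary linear interpolation between two adjacent levels whose estimates bracket α. A minimal sketch, using the figures quoted in Example 1 (0/2 at -.7 and 5/7 at -.2); the function name is hypothetical and exact fractions are used only to keep the arithmetic transparent.

```python
from fractions import Fraction


def interpolate_level(x_low, p_low, x_high, p_high, alpha):
    """Stress at which the line through (x_low, p_low), (x_high, p_high) equals alpha."""
    if p_low == p_high or not (p_low <= alpha <= p_high):
        raise ValueError("alpha must be bracketed by two distinct estimates")
    return x_low + (alpha - p_low) * (x_high - x_low) / (p_high - p_low)


# Example 1: estimates 0/2 at -.7 and 5/7 at -.2, target alpha = .5.
estimate = interpolate_level(Fraction(-7, 10), Fraction(0, 1),
                             Fraction(-2, 10), Fraction(5, 7),
                             Fraction(1, 2))
```

The result is -7/20 = -.35, which necessarily lies between the two bracketing levels -.7 and -.2.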


ROTHMAN  DESIGN 


BACKGROUND. The second new design for the inverse response
problem is built on a design by Marks [11] for locating the step in a step
response function. Thus we shall begin with a brief review of that design
in the case of infinite sample size (the same design is very nearly optimum
even for small samples).

Let the step response function M(x) be such that

$$ M(x) = \begin{cases} 0 , & x < x_s \\ M_0 , & x = x_s \\ 1 , & x > x_s \end{cases} $$

and 0 < M_0 < 1.


Suppose we have some previous estimate of the step location x_s which
we denote by x̂_s and which we assume is normally distributed with unknown
mean, x_s, and known standard deviation, w. Let the successive test
levels be L_i (i = 1, 2, ...) and the response at L_i be R_i. The first test
is at that stress, L_1 = x̂_s, which is the best prior guess of x_s. Then

    L_2 = L_1 - 1.17w    if R_1 = 1
    L_2 = L_1 + 1.17w    if R_1 = 0

Since the design is symmetric about L_1, we shall give the next two test
levels only for R_1 = 0; these are

    L_3 = L_1 + .55w     if R_1 = 0, R_2 = 1
    L_3 = L_1 + 1.99w    if R_1 = 0, R_2 = 0

and

    L_4 = L_1 + .273w    if R_1 = 0, R_2 = 1, R_3 = 1
    L_4 = L_1 + .847w    if R_1 = 0, R_2 = 1, R_3 = 0
    L_4 = L_1 + 1.537w   if R_1 = 0, R_2 = 0, R_3 = 1
    L_4 = L_1 + 2.657w   if R_1 = 0, R_2 = 0, R_3 = 0

In order to simplify the computations, the following approximation, which
only slightly affects the efficiency of the Marks design, will be used:

1. If R_1 = R_2 = ... = R_i, then L_{i+1} = L_i ± 1.167w/√i, i = 1, 2, ...
(plus if all responses are 0, minus if all are 1).

2. For all other cases the successive test levels are determined by
"splitting the difference" between the lowest 1 and the highest 0.

For the fourth test, for example, this approximation gives (for the same
result situations as above)

    L_4 ≈ L_1 + .292w
    L_4 ≈ L_1 + .875w
    L_4 ≈ L_1 + 1.580w
    L_4 ≈ L_1 + 2.666w

The effect of these small changes from the Marks values on the efficiency
of the design is negligible. In fact Marks has shown [11] that even larger
changes do not have a significant effect.

It is interesting to note that the factor 1/√i can be thought of as a
compromise between the term 1/i in the original Robbins-Monro process
and the constant step used to start the delayed R-M process.

RULES FOR ROTHMAN DESIGN. If it is known that w is very
large compared to the length of the interval in which the response func-
tion essentially goes from 0 to 1, then it is obvious that the Marks design
could very profitably be used for the first few tests. Thus we propose the
following design:

    L_1 = y + 1.167w(α - .5)

where y denotes the experimenter's initial estimate of x_α. The second
test is at

    L_2 = L_1 + 1.167w

if the first result is a 0, and at

    L_2 = L_1 - 1.167w

if it is a 1. The general rules for planning the (r+1)th test are:

1. After the rth test, all of the data are analyzed by the method of
reversals (Appendix I).

2. Compute

$$ Y_r = \sum_{i=1}^{r} \frac{1}{i} \approx \gamma + \ln(r + .5) + \frac{1}{24 (r + .5)^2} $$

where γ is Euler's constant, γ = .5772... . This quantity is asymptotic
to the expected number of plateaus given by the method of reversals.
Thus r/Y_r is roughly the average number of points which have gone
into each response estimate.

3. Compute

$$ \Delta = .6745 \sqrt{\alpha (1 - \alpha) Y_r / r} $$

4. If there are any stress levels at which the estimated response is
greater than or equal to min(α + Δ, 1), let S_1 denote the lowest of
these. If there are any stress levels at which the estimated response
is less than or equal to max(α - Δ, 0), let S_2 denote the highest of these.

5. If neither S_1 nor S_2 exists, let

$$ L_{r+1} = L_1 . $$

If S_1 exists, but not S_2, let

$$ L_{r+1} = (S_1 + L_r)/2 - 1.167 w / \sqrt{r} . $$

If S_2 exists, but not S_1, let

$$ L_{r+1} = (L_r + S_2)/2 + 1.167 w / \sqrt{r} . $$

If both S_1 and S_2 exist, as is generally the case for large
sample sizes, let

$$ L_{r+1} = (S_1 + S_2)/2 + 1.167 w (\alpha - \bar{a}_r) , $$

where ā_r is the fraction of responses in the first r tests:

$$ \bar{a}_r = \sum_{i=1}^{r} R_i / r . $$

For large sample sizes, the second term should be replaced by
(α - ā_r)/d_r, where d_r is an estimate of M'(x_α) based on the
sample. For example, d_r = 2Δ/(S_1 - S_2) could be used, but
only if there is some data in the interval (S_2, S_1). (Note that
this interval is also an approximate 50% confidence interval on
x_α.)
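The planning rules can be collected into a short routine. The sketch below is an illustration under stated assumptions rather than the authors' program: the function names are hypothetical, the monotone estimates are taken as given (they would come from the method of reversals), Y_r uses the harmonic-sum approximation γ + ln(r + .5) + 1/[24(r + .5)²], and the large-sample replacement of 1.167w by a slope estimate is not included.

```python
import math

EULER_GAMMA = 0.5772156649


def first_level(y, alpha, w):
    """L_1 = y + 1.167 w (alpha - .5), with y the prior guess of x_alpha."""
    return y + 1.167 * w * (alpha - 0.5)


def plateau_count(r):
    """Y_r: approximation to the harmonic sum 1 + 1/2 + ... + 1/r."""
    return EULER_GAMMA + math.log(r + 0.5) + 1.0 / (24.0 * (r + 0.5) ** 2)


def half_width(alpha, r):
    """Delta = .6745 sqrt(alpha (1 - alpha) Y_r / r)."""
    return 0.6745 * math.sqrt(alpha * (1.0 - alpha) * plateau_count(r) / r)


def next_level(estimates, last_level, start_level, r, n_responses, alpha, w):
    """Level for test r+1.

    estimates: (stress, monotone response estimate) pairs after r tests.
    """
    delta = half_width(alpha, r)
    upper = [x for x, p in estimates if p >= min(alpha + delta, 1.0)]
    lower = [x for x, p in estimates if p <= max(alpha - delta, 0.0)]
    s1 = min(upper) if upper else None   # lowest level with a high estimate
    s2 = max(lower) if lower else None   # highest level with a low estimate
    step = 1.167 * w
    if s1 is None and s2 is None:
        return start_level
    if s2 is None:
        return (s1 + last_level) / 2.0 - step / math.sqrt(r)
    if s1 is None:
        return (last_level + s2) / 2.0 + step / math.sqrt(r)
    return (s1 + s2) / 2.0 + step * (alpha - n_responses / r)


# First steps for alpha = .05, w = 5, y = -.2, with R_1 = 0 and R_2 = 1.
L1 = first_level(-0.2, 0.05, 5.0)
L2 = next_level([(L1, 0.0)], L1, L1, 1, 0, 0.05, 5.0)
L3 = next_level([(L1, 0.0), (L2, 1.0)], L2, L1, 2, 1, 0.05, 5.0)
```

With these inputs the routine gives L_1 = -2.826, L_2 = 3.009 and L_3 = -2.534 to three decimals.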

SIMULATED EXAMPLE OF DESIGN. Let α = .05, w = 5, and the
true response function be cumulative normal with μ = 0, σ = 1. Then
x_α = -1.645. Suppose y = -.2. Then

    L_1 = -.2 + (1.167)(5)(.05 - .5) = -.2 - 2.626 = -2.826

Now suppose R_1 = 0 (R_i denotes the response to the ith test). At
this point Y_r = 1, Δ = .6745 √(.05 × .95) = .15.

Since the estimated response at -2.826 is 0, and since this is the
highest level at which the estimate is less than or equal to max(.05 - .15, 0)
= 0, we have S_2 = -2.826. Since there are no test levels at which the
estimated response exceeds min(.05 + .15, 1) = .20, S_1 does not exist. Then

    L_2 = (L_1 + S_2)/2 + (1.167)5/√1
        = -2.826 + 5.835
        = 3.009

Suppose now R_2 = 1. Now we have S_1 = 3.009, S_2 = -2.826,

    L_3 = (.183)/2 + 1.167(5)(.05 - .50)
        = -2.534

Suppose now R_3 = 0. Then S_1 = 3.009, S_2 = -2.534,

    L_4 = (.575)/2 + 1.167(5)(.05 - .3333)
        = -1.366

Suppose now R_4 = 0. Then S_1 = 3.009, S_2 = -1.366,

    L_5 = (1.643)/2 + 1.167(5)(.05 - .25)
        = -.345

If R_5 = 1, then S_1 = -.345, S_2 = -1.366,

    L_6 = (-1.711)/2 + 1.167(5)(.05 - .4)
        = -2.898
The design might continue as follows:

     r      L_r      R_r

     6    -2.898      0
     7    -2.509      0
     8    -2.231      0
     9    -2.023      0
    10    -1.860      0
    11    -1.731      0
    12    -1.625      0
    13    -1.536      0
    14    -1.461      0
    15    -1.398      0
    16    -1.342      0
    17    -1.281      0
    18    -1.208      0
    19    -1.133      0
    20    -1.061      1
    21    -1.681      0
    22    -1.639      0
    23    -1.601      0
    24    -1.566      0
    25    -1.535      0
    26    -1.505      0
    27    -1.478      0
    28    -1.454      0
    29    -1.430      0
    30    -1.409      0
    31    -1.389      0
    32    -1.370      0
    33    -1.352      1

At this point the analysis of results by means of the method of reversals
becomes nontrivial. The estimate is 1/5 = .2 at -1.352. It turns out
that S_1 = -1.352, S_2 = -1.366,

    L_34 = -1.359 + 1.167(5)(.05 - 4/33)
         = -1.774

We continue:

     r      L_r      R_r

    34    -1.774      0
    35    -1.754      0
    36    -1.734      0
    37    -1.716      0
    38    -1.698      1

Let us present the entire analysis at this step, for this is the first
time it is possible to get a decent estimate of M'(x_α).


    Stress     Responses/Trials    Estimate

     3.009          1/1              1.00
     -.345          1/1              1.00
    -1.061          1/1              1.00
    -1.133          0/1               .20
    -1.208          0/1               .20
    -1.281          0/1               .20
    -1.342          0/1               .20
    -1.352          1/1               .20
    -1.366          0/1               .056
    -1.370          0/1               .056
      ...
    -1.734          0/1               .0
    -1.754          0/1               .0
    -1.774          0/1               .0
    -1.860          0/1               .0
    -2.023          0/1               .0
    -2.231          0/1               .0
    -2.509          0/1               .0
    -2.534          0/1               .0
    -2.826          0/1               .0
    -2.898          0/1               .0

(The five levels from -1.352 to -1.133 pool to 1/5 = .20.)


Now we have r = 38, Y_r = .5772 + ln(38.5) = 4.228,

    Δ = .6745 √((.05)(.95)(4.228)/38)
      = .049 .

Since min(α + Δ, 1) = .099, we have S_1 = -1.352. Since max(α - Δ, 0) =
.001, we have S_2 = -1.716. Furthermore, (S_2, S_1) is not empty, so we
may replace 1.167w = 5.835 by our estimate of 1/M'(x_α), which is

    (S_1 - S_2)/2Δ = .364/.098 = 3.71

(The true value is actually 1/M'(x_α) = 9.7, so we have accidentally
adjusted the coefficient in the wrong direction.) Then

    L_39 = (-1.352 - 1.716)/2 + 3.71(.05 - 5/38)
         = -1.837 .

The analysis again becomes routine until the next 1 occurs.
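The two slope figures quoted in this step can be checked numerically. The sketch below compares the sample estimate (S_1 - S_2)/2Δ with the true 1/M'(x_α) for the standard normal response function used in the simulation; the function names are hypothetical.

```python
from statistics import NormalDist


def inverse_slope_estimate(s1, s2, delta):
    """(S_1 - S_2) / (2 Delta): sample estimate of 1/M'(x_alpha)."""
    return (s1 - s2) / (2.0 * delta)


def true_inverse_slope(alpha, mu=0.0, sigma=1.0):
    """1/M'(x_alpha) when M is cumulative normal, so M' is the normal density."""
    dist = NormalDist(mu, sigma)
    return 1.0 / dist.pdf(dist.inv_cdf(alpha))


estimated = inverse_slope_estimate(-1.352, -1.716, 0.049)  # about 3.71
actual = true_inverse_slope(0.05)                          # about 9.7
```

The starting coefficient 1.167w = 5.835 is closer to the true value than the sample estimate is, which is the sense in which the adjustment went in the wrong direction.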


Since the results of our simulations do not differ very much, we shall
report here only the results for the cumulative normal response function.
These results are tabulated below:

    α = .5 (asymptotic minimum variance 1.57σ²/N)

    Design                w    Sample     Asymptotic    Variance of
                               Size (N)   Minimum       Estimator in
                                          Variance      100 Simulations

    Alexander (Δ = .1)    1       4         .39            .60
                                  8         .20            .40
                                 17*        .092           .14
    Rothman**             1       8         .20            .36
                                 16         .098           .15
                                 32         .049           .10
                                 64         .025           .041
    Alexander (Δ = 1)    10       4         .39          45.6
                                  8         .20           1.98
                                 14*        .11            .39
    Rothman**            10       8         .20           1.10
                                 16         .098           .38
                                 32         .049           .12
                                 64         .025           .061

    α = .05

    Design                w    Sample     Asymptotic    Variance of
                               Size (N)   Minimum       Estimator in
                                          Variance      100 Simulations

    Alexander             1      16         .28           1.59
    (Δ = [illegible])            32         .14            .75
                                 62*        .072         [illegible]
    Rothman               1      16         .28            .60
                                 32         .14            .24
                                 64         .070           .092
    Alexander (Δ = .25)  10      16         .28          17.5
                                 32         .14           5.88
                                 65*        .069           .23
    Rothman              10      16         .28           1.04
                                 32         .14            .31
                                 64         .070           .14

    α = .01 (asymptotic minimum variance 13.91σ²/N)

    Design                w    Sample     Asymptotic    Variance of
                               Size (N)   Minimum       Estimator in
                                          Variance      25 Simulations

    Rothman               1      64         .22            ***
                                128         .11            ***
                                256         .054           ***
    Rothman              10      64         .22            .33
                                128         .11          [illegible]
                                256         .054           .072

    *   Median sample size required to complete design
    **  Without altering design to incorporate estimates of the derivative
    *** Unsatisfactory because 1.167w was much smaller than 1/M'(x_α)

From this table we may draw certain conclusions:

1. The Alexander designs are excellent if completed or if carried
out to at least [number illegible] tests for α = .5 and at least
[number illegible] for α = .05.

2. The Rothman designs are excellent for smaller sample sizes.
However, if large samples are intended, the experimenter
should utilize the more complicated version of the design (not
yet simulated) in which M'(x_α) is eventually estimated from
the sample and then used to modify the spacing of the subsequent
test levels. Otherwise, as in the anomalous result for α = .01,
w = 1, we may find that the initial spacing (based on 1.167w rather
than on an estimate of 1/M'(x_α)) is completely inappropriate.
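The asymptotic minimum variances in the table are consistent with the expression α(1 - α)σ²/(N φ(z_α)²), where φ is the standard normal density and z_α the standard normal α-quantile. This reading is an assumption inferred from the column headings (1.57σ²/N and 13.91σ²/N), not a formula stated in the text; the function names below are hypothetical.

```python
from statistics import NormalDist


def variance_coefficient(alpha):
    """alpha (1 - alpha) / phi(z_alpha)^2 for a cumulative normal response function."""
    std_normal = NormalDist()
    density = std_normal.pdf(std_normal.inv_cdf(alpha))
    return alpha * (1.0 - alpha) / density ** 2


def asymptotic_min_variance(alpha, n_tests, sigma=1.0):
    """Assumed asymptotic minimum variance of the estimate of x_alpha after n_tests tests."""
    return variance_coefficient(alpha) * sigma ** 2 / n_tests


coefficients = {a: variance_coefficient(a) for a in (0.5, 0.05, 0.01)}
```

The coefficient is about 1.57 at α = .5, about 4.5 at α = .05 and about 13.9 at α = .01, so roughly nine times as many tests are needed at the 1% level as at the median for the same asymptotic precision.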

A  comparison  of  all  simulations  reported  in  [15]  indicates  that  the 
Rothman  method  is  slightly  better  for  poor  prior  information,  and  the 
Alexander  design  is  slightly  better  for  small  w,  for  most  response 
functions  included  in  our  simulations. 

Simulations  of  other  50%  designs  have  appeared  in  the  literature. 
Wetherill  [9]  has  shown  that  for  the  50%  logit  problem,  the  Robbins- 
Monro  process  gives  an  estimator  with  variance  very  close  to  the 
asymptotic  minimum.  However,  his  initial  test  level  is  very  close  to 
the  level  sought,  which  corresponds  to  a  small  value  of  w  (i.e.  ,  a 
great  deal  of  prior  information).  But  the  R-M  process  would  be  very 
poor for small samples if w is very large. Our designs are intended
to  cover  both  cases,  and  it  follows  that  their  efficiencies  at  small 
values  of  w  are  therefore  somewhat  impaired. 

Wetherill claims that small sample inefficiency is due to lack of
linearity in the neighborhood of x_α. However, there is a "growth of
information" (growth of efficiency per test) aspect of small sample
work for any response problem (cf. [15], pp. 212-220) even for the
homoscedastic problem on a straight line (non-quantal response).

Wetherill apparently found unsatisfactory the performance of all
known designs for the inverse response problem when α is not near
50%. This seems to have been due to the bias of the estimators in the
small-sample situation, which we believe is due in turn to increased
nonlinearity of conventionally used types of response functions as one
leaves the neighborhood of α = .5. For example, |M''(x)| for the
cumulative normal response function is maximized at μ ± σ. For values
even further out, it might be imagined that heteroscedasticity would have
an effect.

Our designs for $\alpha = .05$ show the same small-sample inefficiency,
but we do not conclude that this necessarily implies that the designs are
unsatisfactory. More work is needed on the effect of (1) $M''(x_\alpha)$,
(2) heteroscedasticity, and (3) prior information on the minimum
variance which can be reached for a particular sample size.
APPENDIX I

METHOD  OF  REVERSALS 

The method of reversals was first proposed by Brunk, Ayers, Van
Eeden and others [12, 13]. The method is based only on the assumption
that the response function is nondecreasing with increasing stimulus
level. This method is best demonstrated by an example (see [15] for
examples and uses of this method):


Stress   Responses/Trials   First Attempt   Second Attempt   Response Probability
                                                             Estimates

5.0      2/3                2/3             2/3              2/3
3.7      0/1                2/5             5/12             5/12
3.2      2/4                2/5             5/12             5/12
1.9      3/7                3/7             5/12             5/12

(Equal entries in adjacent rows of an attempt column denote stress
levels whose response fractions have been merged.)

The sample response fractions are first arranged in order of increasing
stress. Since the response function is assumed to be nondecreasing
with increasing stress, the sample response fractions may be used as
estimates unless they violate this rule. Whenever such a violation occurs
on consecutive stress levels, an attempt is made to correct the situation
by merging the two response fractions. In the example, the fractions 0/1
and 2/4 violate this rule, and are therefore merged to give 2/5. The
other response fractions remain the same. At this point 3/7 and 2/5 are
a violation, and are merged to get 5/12. The result is now satisfactory.

No matter in what order the violations are corrected, it can be shown
that the final estimates are the unique maximum likelihood estimates.
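
The merging procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' program: the function name `method_of_reversals` and the tuple layout `(stress, responses, trials)` are assumptions made here, and exact fractions are used so that the pooled values 2/5 and 5/12 of the example appear as such.

```python
from fractions import Fraction


def method_of_reversals(levels):
    """Pool adjacent response fractions that violate monotonicity.

    levels: list of (stress, responses, trials) tuples (hypothetical layout).
    Returns a list of (stress, estimate) pairs in order of increasing stress.
    """
    blocks = []  # each block holds [responses, trials, list of stresses]
    for stress, responses, trials in sorted(levels):
        blocks.append([responses, trials, [stress]])
        # A reversal: the newest block has a lower fraction than the block
        # below it. Merge the two and re-check against the next block down.
        while len(blocks) > 1 and (
            Fraction(blocks[-1][0], blocks[-1][1])
            < Fraction(blocks[-2][0], blocks[-2][1])
        ):
            r, n, stresses = blocks.pop()
            blocks[-1][0] += r
            blocks[-1][1] += n
            blocks[-1][2].extend(stresses)
    estimates = []
    for r, n, stresses in blocks:
        for s in stresses:
            estimates.append((s, Fraction(r, n)))
    return estimates


# The example of this appendix.
example = [(5.0, 2, 3), (3.7, 0, 1), (3.2, 2, 4), (1.9, 3, 7)]
for stress, estimate in method_of_reversals(example):
    print(stress, estimate)
```

Each entry of `blocks` corresponds to one group of merged stress levels, so the number of blocks left at the end is the number of distinct estimated ordinates.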

Since we need it in the test, let us define a "plateau" as an ordinate
on the partially estimated response function. In the above example there
are two plateaus, 5/12 and 2/3.
APPENDIX II

GENERAL  CRITERIA  FOR  EVALUATING  DESIGNS  FOR 
THE  INVERSE  RESPONSE  PROBLEM 

D.  Rothman 

The estimation of the abscissa at which a nondecreasing mean
response function $M(x)$ takes on a specified ordinate $\alpha$ may be
called the distribution-free inverse response problem. A judiciously
chosen experimental design for this problem would very likely enjoy
certain common properties independent of considerations due to the
intended sample size, the domain of allowable test levels, the desired
response fraction, the technique of analyzing the data, and the extent to
which blocking is required. One would also expect that designs which
lacked some of these characteristics, but were otherwise excellent,
could be easily modified to conform, and would thereby be slightly
improved.

The  properties  are: 

1. The design is as sequential as possible, in that as much as
possible of the past data is utilized at each step to plan the next test
level, or block of test levels,

2. The stress levels in a test block average the same or less (more)
than the stress levels in the previous block if the average response in
that previous block was greater (less) than the desired response fraction,

3. The test levels converge as rapidly as possible to $x_\alpha$ or to
some minimal set in the test level domain spanning $x_\alpha$,

4. The sample response fraction converges to $\alpha$, and

5. The spacing of the early test levels takes into consideration the
prior density on $x_\alpha$.

Let  us  discuss  these  characteristics  in  detail. 

1. The design should be as sequential as possible. A purely
sequential design would be one in which each test level is not chosen
until all previous data have been carefully analyzed. The reason for this
is that the design must be able to correct itself if it has been testing in
the wrong region due to a poorly chosen initial test level. For example, a
Robbins-Monro process which starts out with too small a step size and a
bad first guess is very poor for small samples, and there is no mechanism
for altering the design after a few results have been observed.

A maximum likelihood technique for such data analysis in the
distribution-free case which can be used with any sequential design is
the "method of reversals" discussed in Appendix I.

In practice such an analysis may not be feasible, since the results
of all previous tests may not be available when the new test is planned,
or there may not be time for the calculations. Nevertheless, as much
data as are available should be analyzed, and it would be hard to beat the
method of reversals for simplicity. We know of no design presently used
which obeys this precept, and we feel that this is a serious defect.
Both of our new designs were conceived to meet this need.

2. The stress levels in a test block should average the same or less
(more) than the stress levels in the previous block if the average response
in that previous block was greater (less) than the desired response
fraction, $\alpha$.

For purely sequential designs this condition implies that the test
level after a "1" will be at an equal or lower level, and the test level
after a "0" will be at an equal or higher level. The up-and-down design
and the stochastic approximations all follow this rule, whereas the
Derman design does not.

For extremely small values of $\alpha$ one would not be too fussy in
demanding that the test after a "0" be at a higher level. In practice the
test efficiency is relatively insensitive to the location of the test
following a "0". This is why it is possible to violate this rule in the
Alexander design for $\alpha$ = 5%. A similar observation could be made for
high values of $\alpha$.
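3. The test levels should converge as rapidly as possible to $x_\alpha$
or to some minimal set in the test level domain spanning $x_\alpha$.

If the allowable test levels are dense in a neighborhood of $x_\alpha$,
the design should actually converge to $x_\alpha$. Such a design is called
a stochastic approximation (of $x_\alpha$), and an example is given by the
Robbins-Monro process,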


$$L_{n+1} = L_n + c_n(\alpha - R_n),$$

where $L_n$ denotes the $n$th test level, $R_n$ denotes the $n$th test
result, and $c_n$ is generally of the form $c/(n+n_0)$. It has been
conjectured that the minimum asymptotic variance for such designs, and
for the general nonparametric inverse response problem for quantal data,
is given by

$$V_{\min}(x_\alpha) \sim \frac{\alpha(1-\alpha)}{N[M'(x_\alpha)]^2},$$

where $M'(x)$ denotes the derivative of the response function, and $N$
denotes sample size. $M'(x)$ should be continuous at $x_\alpha$, and
$0 < M'(x_\alpha) < \infty$. For example, if $\alpha = .5$ and $M(x)$ is
cumulative normal, then

$$V_{\min}(x_{.5}) \sim (\pi/2)\sigma^2/N.$$
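
The cumulative normal example follows directly from the conjectured formula. As a short check, assume $M(x) = \Phi((x-\mu)/\sigma)$, where $\Phi$ is the standard normal distribution function, so that $x_{.5} = \mu$:

$$M'(x_{.5}) = \frac{1}{\sigma\sqrt{2\pi}}, \qquad [M'(x_{.5})]^2 = \frac{1}{2\pi\sigma^2}, \qquad \frac{\alpha(1-\alpha)}{N[M'(x_{.5})]^2} = \frac{(1/4)(2\pi\sigma^2)}{N} = \frac{\pi}{2}\cdot\frac{\sigma^2}{N}.$$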


Based on this conjecture, a relative asymptotic efficiency may be defined
as follows:

$$e \sim V_{\min}(x_\alpha)/V(x_\alpha).$$

To our knowledge this conjecture at present lacks proof, but may
be justified as follows:

a. The R-M process can match this asymptotic variance for the
right choice of $c_n$, namely, $c_n = 1/(nM'(x_\alpha))$.

b. When the response function is known to be cumulative normal
and when the optimal design still turns out to be a stochastic
approximation of $x_\alpha$ (as in Chernoff [14]), the variance is
equal to the expression above, thus making it plausible to
conclude that we can generally do no better.

It  is  intended  that  the  new  designs  satisfy  this  rule.  The  up-and- 
down,  Langlie,  and  Derman  designs  do  not.  It  should  be  pointed  out 
however  that  the  first  two  of  these  were  intended  only  for  the  cumulative 
normal  inverse  response  problem. 
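
To make the Robbins-Monro recursion concrete, the following Python sketch simulates it for a quantal response. It is an illustration under stated assumptions, not one of the designs evaluated in the paper: the cumulative normal response function, the starting level 2.0, the seed, and the names `robbins_monro` and `normal_cdf` are all hypothetical choices. The constant is taken as $c = 1/M'(x_\alpha) = \sigma\sqrt{2\pi}$ with $n_0 = 0$, the choice named in item (a) for $\alpha = .5$.

```python
import math
import random


def normal_cdf(x, mu=0.0, sigma=1.0):
    """Cumulative normal response function M(x) (assumed for illustration)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def robbins_monro(response_prob, alpha, start, c, n0, n_tests, rng):
    """Simulate L_{n+1} = L_n + c_n (alpha - R_n) with c_n = c / (n + n0).

    R_n is 1 with probability M(L_n) and 0 otherwise.
    Returns the list of test levels and the list of 0/1 responses.
    """
    levels, responses = [start], []
    for n in range(1, n_tests + 1):
        level = levels[-1]
        r = 1 if rng.random() < response_prob(level) else 0
        responses.append(r)
        levels.append(level + c / (n + n0) * (alpha - r))
    return levels, responses


rng = random.Random(1965)
levels, responses = robbins_monro(
    normal_cdf, alpha=0.5, start=2.0,
    c=math.sqrt(2.0 * math.pi), n0=0, n_tests=200, rng=rng,
)
print("final level:", levels[-1])
print("sample response fraction:", sum(responses) / len(responses))
```

Because the step is proportional to $(\alpha - R_n)$, every level after a "1" is lower and every level after a "0" is higher, which is the behavior required by property 2.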

4. The sample response fraction should converge to $\alpha$.

If the design is a stochastic approximation, and if $M(x)$ is
continuous at $x_\alpha$, then this property will hold. If the allowable
test levels are discrete, then this rule gives the asymptotic percentage
at each of the two levels that the design converges to.

One might deduce from this a principle of "compensation": if the
sample response fraction exceeds $\alpha$, take the next test at a level
under the latest estimate, to compensate for the lack of 0's. A similar
statement could be made when the sample response fraction is below
$\alpha$. The Rothman design does this explicitly.

5. The spacing of the early test levels should take into
consideration the prior density on $x_\alpha$.

Let $w$ denote the known standard deviation of the (normal) prior
density on $x_\alpha$, $L_i$ denote the $i$th test level, $R_i$ denote the
$i$th response, and let $g = wM'(x_\alpha)/\sqrt{\alpha(1-\alpha)}$. Then
the situation $g \gg 1$ corresponds to the Marks problem of locating a
step; the situation $g \ll 1$ would permit us to imagine that we are
merely continuing a design which had already gone quite far.

Then the above property has the following ramifications:

a. The quantity $|L_2 - L_1|$ should be close to $1.17w$ (as in the
Marks design) for $g \gg 1$, and close to $g|\alpha - R_1|/M'(x_\alpha)$
for $g \ll 1$ (cf. $L_2 - L_1 = (\alpha - R_1)/M'(x_\alpha)$ for the
Robbins-Monro process).

b. If $R_2 = R_1$, then

$$|L_2 - L_1|/2 \le |L_3 - L_2| \le |L_2 - L_1|.$$

If $g \gg 1$, the lower bound is more useful, as in the Marks design.
If $g \ll 1$, the upper bound is more appropriate, as in the delayed
R-M process. The conventional R-M process,

$$L_{n+1} = L_n + (\alpha - R_n)c/n,$$

violates this precept.

c. If $R_2 \ne R_1$, then $|L_3 - L_2|/|L_2 - L_1|$ should be close to
1/2 (as in the Marks design) for $g \gg 1$, and close to
$|\alpha - R_2|/|\alpha - R_1|$ for $g \ll 1$.

The big question here is the quantity $w$. If the prior density is
uniform with range $D$, then the Marks design would change. Nevertheless,
the above rules with

$$w = D/\sqrt{12}$$

should still be useful guidelines.

Often one is testing a population similar to populations tested in
the past, differing perhaps only because of small changes in chemical
formulation or test equipment. In this case the distribution of past
estimates of $x_\alpha$ is just the "prior density" we are using.

If  w  itself  is  extremely  uncertain,  the  experimenter  should  use  a 
high  value  as  a  precautionary  measure. 
DISTRIBUTIONS OF DIXON'S CRITERIA FOR TESTING
OUTLYING OBSERVATIONS

Walter L. Mowchan

Surveillance Branch, Ballistic Research Laboratories,
Aberdeen Proving Ground, Maryland


ABSTRACT. An empirical or Monte Carlo method for determining
the distribution of Dixon-type sample statistics for testing outlying
observations is presented. Results are presented for samples generated
from a normal distribution and for samples generated from a uniform
distribution. The method employed was to select random samples of
sizes n = 5, 10, 15, and 20 from each of the aforementioned distributions.
After ordering the sample values such that $X_1 < X_2 < \ldots < X_n$, the
six different statistics (defined later) for each sample size were computed
for each distribution. A sampling distribution was therefore obtained
empirically for each sample size for each distribution after 500 such
sample trials. The cumulative frequency functions were then plotted
for both the normal and the uniform distributions. With respect to the
normal distribution, these results can be compared with theoretical
values which are published in tabular form by W. J. Dixon [1]. With
respect to the uniform distribution, two contributions are made to the
statistical literature:

1.  A  procedure  for  detecting  outlying  observations  in  samples 
from  a  uniform  distribution  is  presented. 

2.  A  comparison  of  the  cumulative  frequency  functions  indicates 
that  all  extreme  values,  except  for  sample  size  n  =  5, 
rejected  under  the  assumption  of  normality  would  also  be 
rejected  if  the  actual  data  were  in  fact  selected  from  a 
uniform  distribution,  since  the  upper  percentage  points  for 
the  normal  distribution  are  higher  than  for  the  uniform 
distribution. 

Finally, a comparison of the two cumulative distribution functions
indicates which of the six statistics presented are best suited for
checking an extreme value given a certain sample size.
1. INTRODUCTION.

1.1  Definitions  of  Statistics  to  be  Investigated 

Six statistics proposed by Dixon [9] for testing the significance of
outliers are presented. The author has attempted to obtain the
probability distributions of these statistics by the Monte Carlo method
of sampling on an electronic computer. Let us consider n observations
of a sample from Normal and Uniform distributions such that
$X_1 < X_2 < \ldots < X_n$, where $X_1$ is the suspect outlier. Since both
the normal and the uniform distribution are symmetric, we could also
have considered the largest observation, $X_n$, to be a suspected outlier.
Of course, it is easy to observe that for any of the six statistics the
sampling distribution of the smallest value, $X_1$, would be equivalent
to the sampling distribution of the largest value, $X_n$, except for
location or mean.

For  definition  purposes,  let  us  consider  the  following  six  statistics 
which  will  be  investigated  in  detail: 

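1. For a single outlier $X_1$:

$$r_{10} = \frac{X_2 - X_1}{X_n - X_1}$$

2. For a single outlier $X_1$ avoiding $X_n$:

$$r_{11} = \frac{X_2 - X_1}{X_{n-1} - X_1}$$

3. For a single outlier $X_1$ avoiding $X_n$, $X_{n-1}$:

$$r_{12} = \frac{X_2 - X_1}{X_{n-2} - X_1}$$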

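4. For an outlier $X_1$ avoiding $X_2$:

$$r_{20} = \frac{X_3 - X_1}{X_n - X_1}$$

5. For an outlier $X_1$ avoiding $X_2$ and $X_n$:

$$r_{21} = \frac{X_3 - X_1}{X_{n-1} - X_1}$$

6. For an outlier $X_1$ avoiding $X_2$ and $X_n$, $X_{n-1}$:

$$r_{22} = \frac{X_3 - X_1}{X_{n-2} - X_1}$$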
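
Since one stated attraction of these criteria is that they are easy to compute, a small Python sketch may help. It assumes the standard form of Dixon's ratios for a suspect smallest value, $r_{ij} = (X_{1+i} - X_1)/(X_{n-j} - X_1)$ with $i = 1, 2$ and $j = 0, 1, 2$; the function name `dixon_ratio` and the sample values are illustrative only and do not come from the paper.

```python
def dixon_ratio(sample, i, j):
    """Dixon-type ratio r_ij for the smallest observation X_1.

    Assumed form: r_ij = (X_{1+i} - X_1) / (X_{n-j} - X_1),
    where X_1 <= X_2 <= ... <= X_n is the ordered sample.
    """
    if i not in (1, 2) or j not in (0, 1, 2):
        raise ValueError("i must be 1 or 2 and j must be 0, 1 or 2")
    x = sorted(sample)
    n = len(x)
    if n - 1 - j <= i:
        raise ValueError("sample too small for this statistic")
    # x[i] is X_{1+i} and x[n - 1 - j] is X_{n-j} in zero-based indexing.
    return (x[i] - x[0]) / (x[n - 1 - j] - x[0])


sample = [1.0, 2.0, 4.0, 7.0, 11.0, 16.0]  # made-up numbers for illustration
for i in (1, 2):
    for j in (0, 1, 2):
        print(f"r{i}{j} =", round(dixon_ratio(sample, i, j), 4))
```

By the symmetry noted above, the same function can be applied to a suspect largest value by negating every observation first.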


1.2 Brief Historical Background

The testing of extreme values is a very old problem in applied
statistics. The data obtained in experimentation must be carefully
examined so that one can be reasonably certain that the results of
sampling are representative of the process. It is quite obvious that
rejection (or acceptance) of outliers could lead to a much different
course of action than otherwise taken. It should be noted that in some
cases the problem of outliers may depend on common sense and hence
may be a practical problem as well as a statistical problem. A review
of the literature indicates that the problem of outliers received much
attention prior to 1940. In fact explanations concerning outliers were
presented as early as 1850 by W. Chauvenet [2]. His hypothesis
basically stated that some samples contained a very small portion
of observations from a population with a different mean value.

P. R. Rider [3], for example, proposed a solution based on the
assumption that the population standard deviation, $\sigma$, be accurately
known. In a similar manner, J. O. Irwin [4] published in 1925
criteria based on the difference of the first and second (ranked)
observations and on the difference of the second and third (ranked)
observations in a random sample from a normal population. Another
very practical approach was presented in 1935 by McKay [5], who
published a paper on the difference between an extreme observation
and the sample mean. In conjunction with his work, K. R. Nair [6] in
1946 tabulated the distribution of the difference between an extreme
observation and the sample mean for small sample sizes. W. R.
Thompson [7] in 1935, working on the assumption that the standard
deviation was not known, presented a paper, "On a Criterion for the
Rejection of Observations and the Distribution of the Ratio of the
Deviation to the Sample Standard Deviation." One interesting fact
concerning Thompson's work is that he presented an exact test for the
hypothesis that all sample observations were from the same normal
population. Another significant contribution is presented by Grubbs [8],
whose criteria are based on the sample sum of squared deviations from
the mean for all observations as compared to the sum of the squared
deviations omitting the "outlier".

W. J. Dixon [9] in 1950 presented a paper based on sample ranges and
subranges. His paper assumes that the random samples are drawn
from a normal population. In connection with this, Dixon and Massey
[10] proposed a method for estimating the mean and standard deviation
when the effect of outliers (light, medium or heavy) is known. This
paper is concerned primarily with the statistics presented by Dixon [9]
in 1950 since for practical purposes they are very easy to compute.

In addition, one would like to know how much non-normality would
affect the tests, and this is also studied. As an example, this paper
attempts to develop empirically how the sample criteria for a non-normal
distribution (the uniform) compare to those for the normal distribution
when various tests for suspected outliers are performed. As already
mentioned, one of the primary reasons for selecting Dixon's criteria is
that the statistics presented are very easy to compute.

1.3 Monte Carlo Method

With the aid of high speed electronic computers such as BRLESC
(Ballistic Research Laboratories Electronic Scientific Computer) at
Aberdeen Proving Ground, Maryland, a program was available to obtain
random numbers with frequencies equal to those of the uniform or normal
distribution. In order to generate random numbers for both the uniform
and the normal distributions, it was necessary therefore only to enter
a subroutine already on tape. Basically, the subroutine works as follows
for the uniform distribution. An initial value, $X_0$ (.547812619135913),
is selected and multiplied by a "K" factor which is always 5 x 2.
The last fourteen digits are then preceded by a decimal point so that
the number lies between 0 and 1. The resulting number, $X_1$, is then
used to generate $X_2$ in an identical manner and the process is continued
until the n random numbers desired are generated.


In order to generate numbers which follow the normal distribution,
i.e., N(0, 1), a very similar procedure is employed. The computer
first selects 64 random numbers from the uniform distribution and
computes the mean, $\bar{X}_1$. One-half is then subtracted from the mean
$\bar{X}_1$ and the whole quantity is multiplied by $16\sqrt{3}$. Therefore
the first random normal observation would be $16\sqrt{3}(\bar{X}_1 - .5)$.
For the second random normal number, the computer again selects 64
random uniform numbers and follows exactly the same process until n
observations are generated. Since the $X_i$ ($i = 1, 2, 3, \ldots, n$) are
obtained from a uniform distribution, it can easily be shown by use of
the well known central limit theorem [11] that $\bar{X}$ is approximately
N(1/2, 1/768). Therefore, $(\bar{X} - .5)/(1/(16\sqrt{3}))$ is approximately
normally distributed with mean 0 and variance 1. These routines have
been checked by $\chi^2$ tests for Normality and Uniformity and the results
are contained in BRL Report No. 855 dated May 1953 [12]. Incidentally,
the periodicity of the subroutine is one in every four million computing
years.
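
The averaging step can be illustrated with a short Python sketch. It does not reproduce the BRLESC subroutine: Python's built-in generator stands in for the uniform routine, and the function name `approx_standard_normal`, the seed, and the number of draws are assumptions. Only the scaling is taken from the text: the mean of 64 uniform numbers has mean 1/2 and variance 1/(12 x 64) = 1/768, so it is centered at 1/2 and multiplied by $\sqrt{768} = 16\sqrt{3}$.

```python
import math
import random
import statistics


def approx_standard_normal(rng, k=64):
    """One approximately N(0, 1) deviate from the mean of k uniform numbers."""
    mean = sum(rng.random() for _ in range(k)) / k
    # The mean of k uniforms has variance 1/(12k); sqrt(12 * 64) = 16 * sqrt(3).
    return (mean - 0.5) * math.sqrt(12 * k)


rng = random.Random(855)  # arbitrary seed, chosen only for repeatability
draws = [approx_standard_normal(rng) for _ in range(5000)]
print("sample mean    :", round(statistics.mean(draws), 3))
print("sample variance:", round(statistics.pvariance(draws), 3))
```

A deviate produced this way can never exceed $8\sqrt{3}$ (about 13.9) in absolute value, so the extreme tails are truncated, although at a distance far beyond anything a sample of this size would reach.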

In order to obtain a sampling distribution for each of the previously
mentioned six statistics for each distribution, it was decided that 500
trials might be acceptable. For example, for sample size n = 5 from
the uniform distribution, $r_{10} = (X_2 - X_1)/(X_5 - X_1)$ was computed
from 500 trials and the observed cumulative distribution was plotted.
Likewise, this same general procedure was used to obtain $r_{10}$ for the
normal distribution. Since Dixon [1] has already published tabular
results based on an analytical function of the distribution for $r_{10}$ for
the normal universe, it is of primary interest to compare his analytical
function with both the uniform and normal distributions which were
derived in this work empirically by Monte Carlo techniques. These
results (see Appendices I and II) are tabulated and plotted for each of
the six statistics for each of the sample sizes n = 5, 10, 15 and 20.
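
The sampling procedure just described can be sketched as follows. This is a minimal illustration rather than the original machine program: it uses Python's built-in random number generator, handles only $r_{10}$, and the names `r10`, `monte_carlo_r10` and `upper_point`, as well as the seed, are assumptions made here.

```python
import random


def r10(sample):
    """r10 = (X_2 - X_1) / (X_n - X_1) for the ordered sample."""
    x = sorted(sample)
    return (x[1] - x[0]) / (x[-1] - x[0])


def monte_carlo_r10(n, trials, draw, rng):
    """Sorted r10 values from `trials` random samples of size n."""
    return sorted(r10([draw(rng) for _ in range(n)]) for _ in range(trials))


def upper_point(values, alpha):
    """Empirical upper-alpha percentage point of a sorted list of values."""
    k = int(round((1.0 - alpha) * len(values)))
    return values[min(k, len(values) - 1)]


rng = random.Random(500)  # arbitrary seed for repeatability
uniform_r10 = monte_carlo_r10(5, 500, lambda g: g.random(), rng)
normal_r10 = monte_carlo_r10(5, 500, lambda g: g.gauss(0.0, 1.0), rng)
print("uniform, upper 5% point:", round(upper_point(uniform_r10, 0.05), 3))
print("normal,  upper 5% point:", round(upper_point(normal_r10, 0.05), 3))
```

The sorted lists are the empirical cumulative distributions; plotting rank/500 against the sorted values gives curves of the kind referred to in Appendix II.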
2. MONTE CARLO NORMAL VERSUS THEORETICAL NORMAL.

The general contribution of Dixon [1] was to obtain analytical results
based on small sample sizes for the distributions of the six previously
mentioned statistics. Percentage points were then obtained by numerical
integration for various sample sizes from n = 5 to n = 30.

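As an example let us consider $r_{10} = (X_n - X_{n-1})/(X_n - X_1)$,
where the subscripts on the X's indicate ordered values such that
$X_1 < X_2 < \ldots < X_n$. Dixon [1] indicates the density function for
$X_1$, $X_{n-1}$, $X_n$ to be

$$\frac{n!}{(n-3)!}\, f(x_1)\,dx_1 \left[\int_{x_1}^{x_{n-1}} f(t)\,dt\right]^{n-3} f(x_{n-1})\,dx_{n-1}\, f(x_n)\,dx_n. \tag{a}$$

If we let $v = x_n - x_1$, $rv = x_n - x_{n-1}$, $x = x_n$ and integrate $x$
and $v$ over their range of definition we get the density of $r$ to be

$$\frac{n!}{(n-3)!} \int_{-\infty}^{\infty}\int_0^{\infty} \left[\int_{x-v}^{x-rv} f(t)\,dt\right]^{n-3} f(x-v)\,f(x-rv)\,f(x)\,v\,dv\,dx, \tag{b}$$

where $-\infty < x < \infty$ and $0 < v < \infty$. Also let
$f(t) = (1/\sqrt{2\pi})\,e^{-t^2/2}$.

Let us now consider a specific case where n = 3. Formula (b) now
appears as

$$\frac{3!}{(2\pi)^{3/2}} \int_{-\infty}^{\infty}\int_0^{\infty} v\, e^{-\frac{1}{2}\left[(x-v)^2 + (x-rv)^2 + x^2\right]}\,dv\,dx, \tag{c}$$

and collecting terms we get the density function to be

$$\frac{3!}{(2\pi)^{3/2}} \int_{-\infty}^{\infty}\int_0^{\infty} v\, e^{-\frac{1}{2}\left[3x^2 - 2xv(1+r) + v^2(1+r^2)\right]}\,dv\,dx. \tag{d}$$

Upon completing the square we get

$$\frac{3!}{(2\pi)^{3/2}} \int_0^{\infty} v\, e^{-\frac{v^2}{2}\left[(1+r^2) - \frac{1}{3}(1+2r+r^2)\right]}\, \sqrt{2\pi}\,(1/\sqrt{3}) \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}\,(1/\sqrt{3})}\, e^{-\frac{3}{2}\left[x - v(1+r)/3\right]^2}\,dx\,dv, \tag{e}$$

which can easily be integrated to obtain

$$f(r) = \frac{3\sqrt{3}}{2\pi(1-r+r^2)}. \tag{f}$$

Integration of the density function results in the cumulative
distribution function (cdf) which is expressed as

$$F(R) = \frac{3}{\pi}\arctan\left[\frac{2}{\sqrt{3}}\left(R_{10} - \frac{1}{2}\right)\right] + \frac{1}{2}, \tag{g}$$

and upon setting (g) equal to $1 - \alpha$, we can easily obtain

$$R_{10,\alpha} = \frac{\sqrt{3}}{2}\tan\left[\frac{\pi}{3}\left(\frac{1}{2} - \alpha\right)\right] + \frac{1}{2}, \tag{h}$$

where $R_{10,\alpha}$ is the upper $\alpha$ probability level or percentage
point.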
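
Formulas (g) and (h) for n = 3 are simple enough to evaluate directly. The Python sketch below does so; it assumes the forms $F(R) = (3/\pi)\arctan[(2/\sqrt{3})(R - 1/2)] + 1/2$ and $R_\alpha = (\sqrt{3}/2)\tan[(\pi/3)(1/2 - \alpha)] + 1/2$, and the function names are illustrative only.

```python
import math


def cdf_r10_n3(r):
    """cdf (g) of r10 for samples of size 3 from a normal population."""
    return (3.0 / math.pi) * math.atan((2.0 / math.sqrt(3.0)) * (r - 0.5)) + 0.5


def upper_point_r10_n3(alpha):
    """Upper-alpha percentage point (h), the inverse of (g) at 1 - alpha."""
    return (math.sqrt(3.0) / 2.0) * math.tan((math.pi / 3.0) * (0.5 - alpha)) + 0.5


for alpha in (0.01, 0.02, 0.05):
    point = upper_point_r10_n3(alpha)
    # Substituting the point back into (g) should return 1 - alpha.
    print(alpha, round(point, 3), round(cdf_r10_n3(point), 3))
```

Evaluating the cdf at 0 and at 1 gives 0 and 1, which confirms that the density (f) integrates to one over the range of $r$.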

In comparison, the Monte Carlo distribution (based on a sample
of 500 trials) for $r_{10}$ for sample size n = 5 agrees very well with the
analytical function derived by Dixon for n = 5. In general, the six
statistics for sample sizes n = 5, 10, 15, and 20 agree quite well with
Dixon's results, particularly for the upper percentage points. A
$\chi^2$ goodness of fit test indicated that the percentage points for the
Monte Carlo method of sampling did not differ significantly (.05 level
of significance) from those derived theoretically by Dixon; however, it
is strongly recommended that in future work more than 500 trials be
utilized in order that more accuracy may be obtained by Monte Carlo
methods.

Let us derive the density function, the cdf and the upper $\alpha$
probability level for the uniform distribution for the statistic $r_{10}$.
As indicated in (a) and (b) earlier, we can write the density of $r$ to be

$$\frac{n!}{(n-3)!}\int_{-\infty}^{\infty}\int_0^{\infty}\left[\int_{x-v}^{x-rv} f(t)\,dt\right]^{n-3} f(x-v)\,f(x-rv)\,f(x)\,v\,dv\,dx \tag{i}$$

and let $f(t) = 1/(b-a)$ where $a < x < b$, which readily gives us

$$\frac{n!}{(n-3)!}\int_a^b\int_0^{x-a}\frac{[v(1-r)]^{n-3}\,v}{(b-a)^n}\,dv\,dx. \tag{j}$$

Upon performing the integration in (j) we get the density function

$$f(r) = (n-2)(1-r)^{n-3}, \qquad 0 \le r_{10} \le 1,\ n \ge 3. \tag{k}$$

Integration of the density function results in the cdf which is expressed
as

$$F(R) = 1 - (1 - R_{10})^{n-2}, \tag{l}$$

and upon setting (l) equal to $1 - \alpha$ we obtain

$$R_{10,\alpha} = 1 - \alpha^{1/(n-2)}, \tag{m}$$

where $R_{10,\alpha}$ is the upper $\alpha$ probability level or percentage
point. These theoretical results are compared with the Monte Carlo
results and are contained in Appendix No. I.
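
The closed forms (l) and (m) for the uniform case can be tabulated in a few lines of Python. The sketch assumes the forms $F(R) = 1 - (1 - R)^{n-2}$ and $R_\alpha = 1 - \alpha^{1/(n-2)}$; the function names and the choice of probability levels are illustrative.

```python
def uniform_r10_cdf(r, n):
    """cdf (l) of r10 for samples of size n from a uniform distribution."""
    return 1.0 - (1.0 - r) ** (n - 2)


def uniform_r10_upper_point(alpha, n):
    """Upper-alpha percentage point (m): R = 1 - alpha**(1 / (n - 2))."""
    return 1.0 - alpha ** (1.0 / (n - 2))


for n in (5, 10, 15, 20):
    row = [round(uniform_r10_upper_point(a, n), 3) for a in (0.01, 0.02, 0.05)]
    print("n =", n, row)
```

The percentage points do not depend on the end points $a$ and $b$ of the uniform distribution, because $r_{10}$ is a ratio of differences and is unchanged by shifts and changes of scale.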

Another point of interest is that the Monte Carlo results for the
uniform distribution were significantly different from the (theoretical)
normal when the $\chi^2$ goodness of fit test was applied at the .05 level
of significance, as might have been suspected. This would indicate that
Dixon's criteria are rather sensitive to departures from a normal universe.

3.  COMPARISON  OF  NORMAL  AND  UNIFORM  DISTRIBUTION. 

3.1  Sampling  from  a  Normal  Distribution. 

As previously mentioned, the Monte Carlo method based on a sample
of 500 trials did not differ significantly from the analytical method
developed by Dixon for Normal distributions. If one assumed that he
were sampling from a normal distribution, it can be seen that all extreme
values rejected under the assumption of normality (with the exception of
n = 5) would also be rejected if in fact the actual distribution sampled
were uniform. (See Appendix No. I)

3.2  Sampling  from  a  Uniform  Distribution. 

If one assumed that he were sampling from a uniform distribution,
then many extreme values rejected under the assumption of
sampling a uniform distribution would be wrongly rejected if in fact
the actual distribution sampled were normal. (See Appendix No. I)

Hence,  the  error  involved  in  3.1  would  probably  be  less  serious  than 
the  error  involved  in  3.2. 

4.  AN  EXAMPLE. 

This section will serve to illustrate the use of Dixon's criteria
for determining whether a doubtful observation is to be retained or
rejected. One of the classic examples consists of a sample of fifteen
observations of the vertical semi-diameters of Venus made by a
Lieutenant Herndon in 1846 and presented by Chauvenet [2]. In the
analysis of the data which followed, the following fifteen residuals were
obtained and have been arranged in ascending order of magnitude
The residuals -1.40 ($X_1$) and 1.01 ($X_{15}$) appear to be questionable.

Here the suspect outliers lie at each end of the sample. Since no
optimum procedure for testing outliers at both ends of the sample is
currently available unless the population variance, $\sigma^2$, is known, we
shall now illustrate the ease with which Dixon's statistics may be
computed. Let us first test the observation -1.40 since it is most
distant from the mean of the sample. Also, we shall select $\alpha = .05$,
which means that $\Pr(r_{22} > R) = .05$. For sample size n = 15, we get:

$$r_{22} = \frac{X_3 - X_1}{X_{n-2} - X_1} = \frac{-.30 + 1.40}{.48 + 1.40} = \frac{1.10}{1.88} = .585.$$

Since the calculated value of .585 is greater than the critical value
of .525, we reject the observation -1.40 by Dixon's test and now proceed
to check the observation 1.01 for sample size n = 14:

$$r_{22} = \frac{X_{14} - X_{12}}{X_{14} - X_3} = \frac{1.01 - .48}{1.01 + .24} = \frac{.53}{1.25} = .425.$$

Since the calculated value of .425 is less than the critical value
of .546, we accept the observation 1.01 by Dixon's test and no other
values would be tested in this sample.
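
The two computations of this example can be repeated with a few lines of Python. The sketch uses only the residuals quoted in the calculations above and the critical values .525 and .546 stated in the text; the function names are hypothetical, and the assignment of the quoted values to ranked positions ($X_3$, $X_{n-2}$, and so on) follows the definition of $r_{22}$ and is an assumption about the original layout.

```python
def r22_smallest(x1, x3, x_n_minus_2):
    """r22 with the smallest value suspect: (X_3 - X_1) / (X_{n-2} - X_1)."""
    return (x3 - x1) / (x_n_minus_2 - x1)


def r22_largest(x_n, x_n_minus_2, x3):
    """r22 with the largest value suspect: (X_n - X_{n-2}) / (X_n - X_3)."""
    return (x_n - x_n_minus_2) / (x_n - x3)


# Step 1: n = 15, suspect value -1.40, critical value .525 from the text.
low = r22_smallest(-1.40, -0.30, 0.48)
print(round(low, 3), "reject" if low > 0.525 else "accept")

# Step 2: n = 14 after removing -1.40, suspect value 1.01, critical value .546.
high = r22_largest(1.01, 0.48, -0.24)
print(round(high, 3), "reject" if high > 0.546 else "accept")
```

The second ratio is compared against the critical value for n = 14 because the first observation has already been rejected and removed from the sample.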

5.  CONCLUSIONS. 

5.1  Extension  of  Tables  Based  on  the  Normal  Distribution 

Since  the  Monte  Carlo  Normal  Distribution  can  be  used  to  represent 
the  analytical  solution  presented  by  W.  J.  Dixon,  it  is  therefore  possible 
to  extend  these  tables  (See  Appendix  1)  to  sample  sizes  for  larger  values 
of  n,  which  in  many  cases  would  be  of  considerable  interest  in  applied 
statistics. 

5.2 Development of Criteria Based on the Uniform Distribution

The Monte Carlo uniform distribution can be employed to develop
criteria for the rejection of extreme values based on the assumption
of sampling a uniform distribution. Thus, the tables and figures
presented in Appendices I and II may be of significant importance in many
practical situations where the actual distribution is in fact uniform.

5.3 Choice of Statistics

The  cumulative  distribution  functions  plotted  in  Appendix  II  provide 
very  helpful  information  regarding  which  statistic  should  or  should  not 
be  used  given  a  certain  distribution  and  a  certain  sample  size.  For 
example,  if  given  the  normal  distribution,  the  statistic  r^  appears  to 

perform  very  well  for  small  sample  sizes  such  as  n  =  5  while  it  is 
obvious  that  the  statistics  r^  and  r^'would  not  provide  a  good  test  for 

these  small  samples  because  of  the  slope  of  the  curves. 

5.4  Additional  Comments 

In this paper, I have attempted to show that Dixon's [1] criteria
for testing of extreme values based on the assumption of normality
can be established empirically. Also, I have attempted to show what
would happen if the distribution sampled were in fact uniform.

Since analytical or theoretical functions for testing outlying
observations generally become quite involved, further work involving
the effect of skewed distributions such as some of the Pearson Type
curves [11] could be accomplished by Monte Carlo methods on a high
speed computer.

It would also be of interest to develop a two-sided test for examining
extreme values from a sample. In this connection it is suggested
that the sample observations be arranged such that
$X_1 < X_2 < \dots < X_n$.

A proposal for the two-sided test would be to first let $X_1$ be the
suspected outlier (Dixon's approach) and to compute the desired
statistic. Next, from the same sample, let $X_n$ be the suspected outlier
and again compute the statistic. The higher of the two values obtained
would then be chosen. If this procedure were repeated at least 500
times, a two-sided test could be developed empirically for testing
extreme values, and this might have rather wide application. It is
emphasized that no fewer than 500 trials should be used.

Finally, Appendices III and IV contain machine programming data
which could easily be used for obtaining Monte Carlo distributions of
Dixon's statistics [1] based on the assumption of uniformity or normality,
if sample sizes greater than 20 are desired.
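
The two-sided proposal can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the machine program of Appendices III and IV: it uses Dixon's ratio of the gap next to the suspected value to the sample range (the r10 form), and the function names, the seed, and the rule for picking the empirical percentage point are all hypothetical choices made here.

```python
import random


def r10_two_sided(sample):
    """Larger of the r10-type ratios for the smallest and the largest value."""
    x = sorted(sample)
    spread = x[-1] - x[0]
    if spread == 0:
        return 0.0
    low = (x[1] - x[0]) / spread      # X_1 taken as the suspected outlier
    high = (x[-1] - x[-2]) / spread   # X_n taken as the suspected outlier
    return max(low, high)


def critical_value(n, alpha, trials=500, normal=True, seed=1965):
    """Empirical upper-alpha point of the two-sided statistic from `trials` samples."""
    rng = random.Random(seed)
    stats = []
    for _ in range(trials):
        if normal:
            sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        else:
            sample = [rng.random() for _ in range(n)]
        stats.append(r10_two_sided(sample))
    stats.sort()
    # Value exceeded by roughly alpha * trials of the simulated statistics.
    index = min(trials - 1, int((1.0 - alpha) * trials))
    return stats[index]


if __name__ == "__main__":
    for n in (5, 10, 15, 20):
        print(n, round(critical_value(n, 0.05), 3))
```

With only 500 trials the extreme percentage points (for example alpha = .005) rest on two or three simulated values, so a larger number of trials would normally be preferred when machine time allows.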
APPENDIX I

Tables of Upper Percentage Points for
the Uniform and the Normal Distribution


Table of the Upper Percentage Points for the Uniform Distribution

Pr(r_1j > R) = α, where j = 0, 1, 2

Sample              α = .005        α = .01         α = .02         α = .05
Size    Statistic   T*      U**     T*      U**     T*      U**     T*      U**

n = 5   r10         .829    --      .785    .775    .729    .716    .631    --
        r11         .928    .918    .900    .886    .859    .831    .776    .736
        r12         .995    .996    .990    .990    .980    .980    .950    .946

n = 10  r10         --      .490    --      .440    --      .391    .312    .313
        r11         .531    .535    .482    .484    .428    .427    .347    .349
        r12         .586    .588    .536    .541    .479    .476    .393    .391

n = 15  r10         .335    --      .298    .300    .260    .262    .206    .209
        r11         .357    .356    .319    .323    .278    .279    .221    .221
        r12         .382    .381    .342    .345    .299    .298    .238    .255

n = 20  r10         .255    .250    .225    .221    .195    .193    .153    --
        r11         .268    .264    .237    .231    .206    .197    .162    .153
        r12         .282    .281    .250    .253    .235    .230    .171    .166

*  Upper percentage points based on Dixon's Uniform (theoretical).
** Upper percentage points based on Monte Carlo Uniform.
-- Entry illegible in the source.


Table of Upper Percentage Points for the Normal Distribution

Pr(r_ij > R) = α, where i = 1, 2 and j = 0, 1, 2

Sample Size        α = .005        α = .01        α = .02


a  = 

005 

a 

T* 

N** 

1* 

.821 

.822 

.780 

•  93' 

•  939 

916 

a  ■  02 

T*  IN** 


“W 

“7i r 

.876 

.882 

984 

.985 

.901 

.901 

.990 

•  992 

.453 
•  551 
.610 

11B 

.592 

vn 

\o 

VO 

.678 

.671 

.749 

.742 

.574 
.645  I  .616 


300 

554  .358 
358  .358 

372  .375 


.506  .507 

.535  .5301  .502  1  503 


*  Upper percentage points based on Dixon's Normal (theoretical).
** Upper percentage points based on Monte Carlo Normal.
APPENDIX II

Figures of Cumulative Distributions

[Figures 2 through 18: graphs of the cumulative distributions of the
r statistics, one graph per statistic and sample size (for example
n = 10 and n = 15).]
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PR08W-OS9  DISTRIBUTION  SKECT  W.  MCrGMAN 


10  MAT 


BLOC ( N 1-N3Q 1U1-U30 ) 

BLOC (Q1SI-D1S18)D1-0144G ) 

BLOC ITl-TiO) S1-S30 ) SUM  l-SUMS  ) 

F2  FORM ( 9“ 10 1 9- 10 ) 4-10)4-10) 4-5)5-l“5) 

START  VO*. 024928455X  20—.0002 134656U 

CLEAR  1 1440 )N0S.AT(D1)X 

setiss-sin-soou  fss-s*  oes-ox  setihax-oix 
3.0  $£TU*0>* 

3.1  S1.1-1X 

COUNT ( 30 1  INI  1 1 GOTO! 3. lit 
1.0  SET ( I *0 1  L  ENTER  1 2ER0CC ) t 

SET!M1*0)PU*0)X 
IF (0ES*0 )GOTOI 1 .21* 

1.1  ENTER<URN0S2)UKN)X  G0T0(1.3)X 

1.2  ENTER (UKNOSl )UAN)X 

1.3  U1 t I "URN1  T1»1*14.5«UKN-4.5X 
C0UNT(SS)IN(  1  (GOTO (1.1)% 

4.0  SET  C I >0  )  X 

IF (DES»0)G0T0(4.  1) *  G0T014.2)* 

4.1  ENTEMNRNOS’. )S1«I)1)N1«I)X  UES-1X  GOTOI4.3)t 

'  4.2  ENTERINKN0S21SI, 1) 1 INI » US 

4.3  COUNT(SS) IN! I)GOTO!4.2)X 

9.0  SETICT-dX 

SETIM-N1U 

10.0  INTII-M41IX  INT!K»M)X  SET! X*0 ) Z*0 ) X 

11.0  IF ( |K<» l (GOTO ( 12 .0 ) A 

T1*iK1  , K*  .  1 X  tl*TH 

12.0  COUNT ( SS-1 ) IN t  X ) GOTO ( 13 « 0) X  G0T0I14.OI* 

13.0  INC(IM*1)X  GOTO  1 1 1 . 0  I X 

14.0  COUNT 1  SS-1 ) IN! Z ) GOTO ( 15 .0 )  X  GOTO  1 16.0 ) X 

15.0  INT!K*M*Z)X  INTU-K+llX  INTIX-ZJX 

GOTO!  11.  OU 

16.0  INC|CT*CT*l)X 

IF-1NTICT*1)GOTOU6.1)X 
GOTO (  17. 0) X 

16.1  SET(M«U1)X  GOTO!  10.0)  t 

17.0  SET 4 C T«  1 1  X  SET(M*N1U  ENTER! 2ER0CC > X 

17.1  INTI I«M+SS“1 )% 

18.0  Tl*i ( N*1 )-»MX  T2*i (M*2 )-.MX 

T3-*I-.MX  T4*,< l-l)-,MX  T5-, 1 I-21-.MX 

16.1  Dl$l*Tl/T3X  01S2»Tl/T4X  01S3-TI/T5X 
UIS4-T2/T3X  0I$5»T2/T4X  0IS6-T2/T5X 
INCICT-CT4UX 

Tl*,l-,(l-l)X  T2-,  I-.l  I-2)X  T4»  t [ - » ( M+l ) X  T5»»  I-*  I  M*2 )  X 
DIS7-U/T3X  Ol$8»Tl/T4X  DIS9-T1/T5X 
0IS10»T2/T3i  0IS11-T2/T4X  D1S12-T2/T5X 
IF(01S1>DIS7)G0T0C2S.0)X  DIS13*0IS7X 
24.0  IF  !D1S2>DIS8)G0T0C25. 1)X  D1514-DIS8* 

24.1  IF(U153>01S9)GOTOI25.2)X  0 1 S 15-U 1 S9X 

24.2  IF(0tS4>01Sl0)G0rO(2b.3)t  01 S16-DI S104 

24.3  IFtOIS5>DISll)GOTO!25.4)X  01 S17-01 SI U 

24.4  IF(0IS6>0IS12)G0T0(25.5)X  01S18-DIS12X 

24.5  GOTO  <  TALLY  l )  ‘i 

25.0  DIS1J-DISU  GOTO  ( 24,0 )  '* 

25.1  0 1  S  14*01 S2X  C0TO124.1IX 

25.2  01S15-D1S3*  G0TC124.21X 

25.3  01 SlO»01 S4X  GOTO ( 24, 3 ) X 

25.4  01S17-DI55*  GOTO ( 24 .4  I X 

25.5  0IS10*1)1S64  GOTO  (  24. 5  I « 

TALLVl  SE  T  (  G«0  1 5!  SETIQ-ODi 


10  HAY  i)-j,  HAl.f- 
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i  i  i 


LL2 

U3 

LU 

LL5 

LL6 

OPEN 


TALLY2 

ZIP 

19.0 


19.2 
20.0 
26.0 
26.1 

26.2 


21.0 


22.0 


ri-om.r.x  I nti mi=o*p« u 
IF(T1*>.5)C0T0(LL2)X 
IF(Tl*>.25IGOTOILL3)t 
!F(TI*>.125>G0TC(IL4>S: 

T2  =  0i  T 3* . 02 5X  GOTOIOPENU 
IF(TI»>.75)G0T0<LL5I* 

INTIM1-MU2CU  T2=.5t  T3-.525X  GOTOIOPENU 
INTIM1-NU10U  T2-.25X  T3«.275t  GOTOIOPENU 
INTIM-KI+5U  T2-.120X  T3-.15S  GOTOIOPENU 
lF(Tl«>.875)G0TG(LL6)t 

INT  (  M1*M1  +30 )  %  T  2  * .  7  5  X  T3«.  775X  GOTOIOPENU 
lNT(Ml*Mi+35U  T2-.Q75X  T3-.9S 
IF  (T2«<U<r  3)  GOTO  IT  ALL  Y2  UK 

T2*T2t.025t  T3-T3+.025t  IF  I T2»l I Wl TMl Nl .001 IGOTOI TALLY2 ) * 
INT(Ml*Ml  +  m  GOTO  (  OPEN  )  X 
INTI  ,M1  =  , Ml +  l ) X 

COUNT  1  1H)  INIGIGOTOIZIPU  GOTO!  19.0  It 

INTI  PCI* PO+AOUGOTO ILL  1U 

INTI  P^PQ^OU 

IF-INT I C  T*2  > GO  TCI  19.2  It 

GOTO ( 20. G I S 

SET  I  M«U1 1  'i  GOTO  1 17. 1 U 
COUNT  IN) I NIMAX IGOTOI l.OU 
ENTERIZEkOCCI*  SET (2*0) 

T 1  *0X  T2-.025X  SET l  1*0 1 1  INTISUM-OU 
INTI SUM»SUM+01,ZU 

PRINT-FOKMAT IF2)-(Tl)(T210l»Z) SUM  I  SSI 

T  1«T  1  +  ,02 5t  T2-T2+.025*  INCIZ-Z  +  IU 

COUNT (40) INI  I IGOTOI  26.2)  t 

IF- INT  I  Z- 1440 1  GOTO  I  21.0 U  GOTO  I  26. 1U 

INCISS-SS*5U  FSS«FSSi5 

CLEAR!  1440)  NOS. ATI01U 

SET  (  ViAX»0  1 1 

IF-INT  ISS>30)GOTO(N.PROBU 

GOTO!  1.011 

LIST 

END  GOTOISTARTU 
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►»kCu  fc-oe<;  sisTitiBuno.^  skect  foi<  w.mowchan 


IC  A  V  uj,  p,V 


Fl 

F2 

F3 

START 


1.0 

1.1 

2.0 

2.1 


J  «  w 
't  *  C 
*J  •  0 
6»  U 

7*0 

8.0 


Q  t  rtf*  I  .  I  .  w  1.  A  n  »  r  »  r  /  n  i  v  i  TV.  v  .  <  <  <v  i 
t  n  «  f  W  *  V*f  V  f  I  Hki  — 1  HUfW  / 

F0RM(4-5)8-5)B-5) 

FORM  0-10  >3- 70)  \ 

FORM  (9- 10) <5- 10) 4- 10)  4- 10  >4-5  >8-3  )8-5) 

READ-FOft.  .'(Fl)-(SS)R)O) 
REAO-FOltMAT(F2)-l500)NQS.AT(Xl)!S 
CLEArt(40)N05.AT  (Cl  )Z 
CLEAK(40)NUi.AT(TALl)« 

SET  (  1  »0  )  X, 

Tl-01  T2“.025^  SCr(Z-O).. 
IF(TI»<XI»1<T2)l.CT0(3.C)v; 

T l» Ti ♦  .025S-  T2»T?*.0 255- 

IF  IT1  »1.0I  wi  I'HUl  .C01  JGOTOI  3.U  , 

lNCU»l*l)!E  GOTO(2.w)  " 

li.T(Cl,2*Cl,2*;  i 

OOUNt  i  .<  I  )i-  Ivi  l  .  I  ) 

5UM- J  :  5  P  I  (  /  -  >.  i 

1  M T  <  3UV*  iljrtc  lie)  i.Nf  (  T  Al  i  ,2  -  SO! )  A 

C0U  'll'  (40  )  INU  )uv  10  (  o.L)  . 

T 1 *0&  T2».025l  jir ( 1»0U 

Pill  NT-F0RKAT I  F3)-( Tl )  T2)Cl»l>r«LltI>SS)R)DU 
Tl-TU.025*  T2«r2+.025(i 
COUNT  (40  )  IN(  I  JC0T0  ( 8.0X  GOTO  ( START )  X 
END  GO  TO ( START ) % 


A  SIMPLIFIED  TECHNIQUE  FOR  ESTIMATING  DEGREES  OF  FREEDOM 
FOR  A  TWO  POPULATION  T  TEST  WHEN  THE  STANDARD 
DEVIATIONS  ARE  UNKNOWN  AND  NOT  NECESSARILY  EQUAL 


Eugene  Dutoit  and  Robert  Webster 
Quality  Assurance  Directorate,  Ammunition  Reliability  Division 
Mathematics  and  Statistics  Branch, 

Picatinny  Arsenal,  Dover,  New  Jersey 


The purpose of this paper is to develop a practical aid for the
descriptive statistician performing tests of statistical significance who
must do most of his computing at a desk using an ordinary "desk-top"
calculator.

The t statistic is used to test for significant differences between two
sample means when samples are randomly selected from two normally
distributed populations. If samples are drawn from two normal
populations, the standard deviations of these populations are unknown,
and their computed estimates indicate that they are not necessarily
equal, then the t test statistic is computed by:

(1)  $t' = \dfrac{\bar{X}_1 - \bar{X}_2}{\left(S_1^2/n_1 + S_2^2/n_2\right)^{1/2}}$

where this random variable follows approximately a t-distribution with
degrees of freedom (d.f.) equal to:

(2)  $\text{d.f.} = \dfrac{\left(S_1^2/n_1 + S_2^2/n_2\right)^2}{\dfrac{(S_1^2/n_1)^2}{n_1 - 1} + \dfrac{(S_2^2/n_2)^2}{n_2 - 1}}$

where $S_1^2$ and $S_2^2$ are estimates of $\sigma_1^2$ and $\sigma_2^2$
respectively and $n_1$ and $n_2$ are the sample sizes. Since equation (2)
is a cumbersome expression to work with, an alternate form of this
expression would be desirable for analysis performed on a desk
calculator or a slide rule.
In order to determine whether or not the standard deviations are
equal, the F ratio test is used:

(3)  $F = S_2^2 / S_1^2$

where $S_2^2$ is denoted such that $S_2^2 > S_1^2$. This guarantees that
$F > 1$. If the computed F ratio is larger than the tabulated critical
value of F, the two standard deviations are judged unequal. Equation (2)
can be manipulated so that the d.f. can be expressed as a function of
the ratio of the variances and the values $n_1$ and $n_2$. Since the
ratio of the variances has already been computed as the F test
statistic, equation (2) can be generalized as:

(4)  $\text{d.f.} = f(F = S_2^2 / S_1^2,\; n_1,\; n_2)$


If $n_1 \neq n_2$ then equation (2) can be rewritten as:

(5)  $\text{d.f.} = \dfrac{(n_1 - 1)(n_2 - 1)(n_2 + n_1 F)^2}{n_2^2 (n_2 - 1) + n_1^2 (n_1 - 1) F^2}$

or alternatively:

(6)  $\text{d.f.} = \dfrac{(n_2 + n_1 F)^2}{\dfrac{n_2^2}{n_1 - 1} + \dfrac{n_1^2 F^2}{n_2 - 1}}$

Equation (6) is more efficient because twelve operations are needed to
calculate the d.f., whereas equation (5) requires 17 separate operations.
Equation (5), however, has eliminated the "complex function" appearance
and might be more palatable to the statistical employee who would have
to compute the value.
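
The equivalence of the three forms can be checked numerically. The sketch below assumes that sample 2 is the one with the larger variance, so that F = S2²/S1², and writes the d.f. once from the two variance estimates and twice from F and the sample sizes; the function names and the illustrative inputs are hypothetical and do not come from the paper.

```python
def df_from_variances(s1_sq, n1, s2_sq, n2):
    """d.f. computed directly from the two variance estimates (equation (2))."""
    a = s1_sq / n1
    b = s2_sq / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))


def df_single_fraction(F, n1, n2):
    """d.f. as one simple fraction in F, n1 and n2 (the form of equation (5))."""
    numerator = (n1 - 1) * (n2 - 1) * (n2 + n1 * F) ** 2
    denominator = n2 ** 2 * (n2 - 1) + n1 ** 2 * (n1 - 1) * F ** 2
    return numerator / denominator


def df_compound_fraction(F, n1, n2):
    """d.f. with fractions in the denominator (the form of equation (6))."""
    return (n2 + n1 * F) ** 2 / (n2 ** 2 / (n1 - 1) + (n1 * F) ** 2 / (n2 - 1))


def df_equal_sizes(F, n):
    """Special case n1 = n2 = n."""
    return (n - 1) * (1 + F) ** 2 / (1 + F ** 2)


if __name__ == "__main__":
    # Illustrative inputs only: sample 2 has the larger variance.
    s1_sq, n1 = 4.0, 12
    s2_sq, n2 = 10.0, 9
    F = s2_sq / s1_sq
    print(df_from_variances(s1_sq, n1, s2_sq, n2))
    print(df_single_fraction(F, n1, n2))
    print(df_compound_fraction(F, n1, n2))
```

All three calls print the same value, which is the point of the rewriting: once F has been computed for the variance test, the d.f. needs only F and the two sample sizes.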
If $n_1 = n_2 = n$, then equations (5) and (6) reduce to:

(7)  $\text{d.f.} = \dfrac{(n - 1)(1 + F)^2}{1 + F^2}$


The derivation of equations (5), (6) and (7) is presented in Appendix
A. These equations lend themselves to computer applications for selected
values of $n_1$, $n_2$ and F. The number of degrees of freedom can then be
calculated and presented as tables or graphs. The output format used for
the initial computer run was of the type:


Twenty values of F were chosen so that twenty tables (see Figure 1)
were generated. These tables were then used to generate sets of curves
as per Figure 2 below:

A fixed value for $n_1$ and F produces a smooth curve for various
values of $n_2$. The first attempt at plotting curves proved to be a bit
impractical for larger values of $n_1$, $n_2$ because the curves become
highly confounded. A more realistic plot can be made (for Ordnance
purposes) by using values of $n_1$, $n_2$ < 50.

Example:  Suppose  we  should  like  to  compare  the  test  scores  of  two 
high  school  science  classes.  We  wish  to  detect  a  significant  difference 
in  the  dispersion  of  scores  within  each  class  and,  in  addition,  we  should 
like  to  detect  whether  or  not  the  average  score  of  one  class  is  significantly 
greater  than  the  other. 

We  consider  the  following  data: 
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Class 

a. 

Class 

_b_. 

95 

81 

69 

80 

83 

67 

74 

77 

46 

81 

91 

92 

71 

85 

90 

86 

76 

52 

82 

78 

64 

86 

71 

82 

82 

79 

72 

84 

80 

80 

84 

88 

91 

56 

64 

S  3 

                      Class a     Class b
Sample size n         20          16
Mean                  75.2        81.3
Variance s^2          170.06      60.10
Std. deviation s      13.04       7.75
Using the F-ratio test for equality of variances (dispersions),

$F_o = s_a^2 / s_b^2 = 2.83 > 2.77 = F_{\alpha/2;\, n_a - 1,\, n_b - 1}$

when α = 0.05. We conclude that a significant difference between the
variability (or dispersion) of test scores is detected.

Since we only have estimates of the true variances of the data and
have shown these estimates to be unequal, we should employ the two
population t-test for data with unknown and unequal variances to
determine whether the average score of Class b significantly exceeds
that of Class a. We must, therefore, compute
$\text{d.f.} = \dfrac{\left(s_a^2/n_a + s_b^2/n_b\right)^2}{\dfrac{(s_a^2/n_a)^2}{n_a - 1} + \dfrac{(s_b^2/n_b)^2}{n_b - 1}} = \dfrac{150.31}{4.75} = 31.6 \approx 32$


As might be expected, these calculations are lengthy, time-consuming
and error prone. An alternate method to determine d.f. would be to
consult the graphs which have been prepared to yield values of d.f. when
the sample sizes and F-ratio are known.

The graphs plot d.f. vs $N_2$ for a specified value of $N_1$ and for
certain values of $F_o$. In this case $N_2 = 20$ and $N_1 = 16$ since the
variance of sample a > variance of sample b. The simple steps to
determine d.f. are as follows:

1. Find the graph corresponding to $N_1 = 16$.

2. On this graph find $N_2 = 20$.

3. On the vertical line corresponding to $N_2 = 20$ find the points of
intersection corresponding to F-ratios of 2.00 and 3.00 (only values of
$F_o$ = 1.50, 2.00, 3.00, 4.00, 5.00, 8.00 and 14.00 are plotted).

4. From these points read d.f. (F = 2.00) and d.f. (F = 3.00) off the
ordinate:

d.f. (F = 2.00) = 33.5
d.f. (F = 3.00) = 31.3

5. Interpolate to determine d.f. (F = 2.83):

33.5 - 31.3 = 2.2
(.83)(2.2) = 1.8
33.5 - 1.8 = 31.7 ≈ d.f. (F = 2.83).

Thus d.f. = 31.7 ≈ 32, which is compatible with the calculated value
of 31.6 ≈ 32.
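
The graphical reading and the direct calculation can be compared in a short script. This is a sketch under the assumption that Class a (the larger variance) plays the role of sample 2; the two graph readings 33.5 and 31.3 are taken from step 4 above, and the function names are hypothetical.

```python
def df_from_f(F, n1, n2):
    """d.f. as a function of F = S2^2 / S1^2 and the two sample sizes."""
    return (n2 + n1 * F) ** 2 / (n2 ** 2 / (n1 - 1) + (n1 * F) ** 2 / (n2 - 1))


def interpolate_df(F, F_low, df_low, F_high, df_high):
    """Linear interpolation between two plotted F curves."""
    fraction = (F - F_low) / (F_high - F_low)
    return df_low + fraction * (df_high - df_low)


# Summary statistics of the example: Class a has the larger variance.
n1, s1_sq = 16, 60.10     # Class b
n2, s2_sq = 20, 170.06    # Class a
F = s2_sq / s1_sq

direct = df_from_f(F, n1, n2)
from_graph = interpolate_df(2.83, 2.00, 33.5, 3.00, 31.3)

if __name__ == "__main__":
    print(round(F, 2), round(direct, 1), round(from_graph, 1))
```

Both routes round to 32 degrees of freedom. Since d.f. is not linear in F, the interpolated value is only an approximation, but over the interval from F = 2.00 to F = 3.00 the error is well below one degree of freedom in this example.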
$\text{d.f.} = \dfrac{(n_1 - 1)(n_2 - 1)(n_2 + n_1 F)^2}{n_2^2 (n_2 - 1) + n_1^2 (n_1 - 1) F^2}$


When $n_1 = n_2 = n$,

$\text{d.f.} = \dfrac{(1 + F)^2 (n - 1)}{1 + F^2}$

which is a linear function for d.f. with "n" intercept n = 1 and slope
$\dfrac{(1 + F)^2}{1 + F^2}$.
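
For reference, the algebra connecting equation (2) to equations (5) through (7) can be sketched as follows. This is a reconstruction under the convention that sample 2 has the larger variance, so that $F = S_2^2 / S_1^2$; it is not taken verbatim from the original appendix.

```latex
\begin{align*}
\text{d.f.}
 &= \frac{\left(S_1^2/n_1 + S_2^2/n_2\right)^2}
         {\dfrac{(S_1^2/n_1)^2}{n_1-1} + \dfrac{(S_2^2/n_2)^2}{n_2-1}}
 && \text{equation (2)} \\
 &= \frac{(n_2 + n_1 F)^2}
         {\dfrac{n_2^2}{n_1-1} + \dfrac{n_1^2 F^2}{n_2-1}}
 && \text{multiply top and bottom by } (n_1 n_2 / S_1^2)^2 \\
 &= \frac{(n_1-1)(n_2-1)(n_2 + n_1 F)^2}
         {n_2^2 (n_2-1) + n_1^2 (n_1-1) F^2}
 && \text{multiply top and bottom by } (n_1-1)(n_2-1) \\
 &= \frac{(n-1)(1+F)^2}{1+F^2}
 && \text{when } n_1 = n_2 = n .
\end{align*}
```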
Logic diagram of computer program

Figure 3

The Fortran statements for the above logic diagram are presented
below as Figure 4.


Figure 4

C     DEGREES OF FREEDOM FOR UNEQUAL VARIANCE T-TEST
C     SIZES = SAMPLE SIZES N1, SIZEL = SAMPLE SIZES N2, F = F RATIOS
      DIMENSION SIZES(20),SIZEL(20),F(20),DF(20,20,20)
   99 READ INPUT TAPE 2,100,N1,N2,N3
  100 FORMAT(3I2)
      READ INPUT TAPE 2,101,(SIZES(I),I=1,N1)
      READ INPUT TAPE 2,101,(SIZEL(J),J=1,N2)
      READ INPUT TAPE 2,101,(F(K),K=1,N3)
  101 FORMAT(12F6.0)
      DO 1000 K=1,N3
      DO 1000 J=1,N2
      DO 1000 I=1,N1
      UPPER=(SIZEL(J)+SIZES(I)*F(K))**2
      DENOM=(SIZEL(J)*SIZEL(J))/(SIZES(I)-1.)+(F(K)*SIZES(I))**2/
     1(SIZEL(J)-1.)
 1000 DF(K,J,I)=UPPER/DENOM
      DO 400 K=1,N3
      WRITE OUTPUT TAPE 3,301,F(K),(SIZES(I),I=1,N1)
  301 FORMAT(1H1,2X,98HDEGREES OF FREEDOM FOR TWO POPULATION T-TEST WITH
     1 UNEQUAL VARIANCES WHERE VAR(N2) EXCEEDS VAR(N1).,//10X,3HF =,F6.2
     2,//8H  X    X,/8H  X N1 X,/8H  X    X,/8H  X    X,1X,18(F4.0,2X))
      WRITE OUTPUT TAPE 3,3011
 3011 FORMAT(6H N2  X,/6X,2HXX,/1X,113(1HX),/7X,1HX)
      DO 400 J=1,N2
      WRITE OUTPUT TAPE 3,302,SIZEL(J),(DF(K,J,I),I=1,N1)
  302 FORMAT(2X,F4.0,1X,1HX,18(1X,F5.1))
      WRITE OUTPUT TAPE 3,3022
 3022 FORMAT(7X,1HX)
  400 CONTINUE
      GO TO 99
      END(1,0,0,0,0,0,1,0,0,1,0,0,0,0,0)


The computed output follows the format as given in Figure 5.
DELETING OBSERVATIONS FROM A LEAST SQUARES SOLUTION


Charles  A.  Hall 

Technical  Services  Division,  Data  Analysis  Directorate, 
White  Sands  Missile  Range,  New  Mexico 


ABSTRACT. In this paper we give a matrix treatment of the classical
least squares theory and determine each observation's contribution to the
least squares solution. If each observation's (or observer's) contribution
is known, then it may be possible to delete certain observations (or
observers), (1) to improve the least squares solution or (2) to minimize
the number of observations (or observers) entering the least squares
solution.

It should be emphasized that redundancy is necessary to obtain a
statistically sound least squares solution; however, it may be
advantageously limited without significantly changing the solution.

Although we present a general least squares theory for uncorrelated
observations, special emphasis is given to the least squares missile
position problem generated by a set of observed azimuths, elevations and
slant ranges from a system of missile tracking systems such as Radar.
The above treatment is used to develop a geometric ordering of available
tracking stations, which is then combined with station ability and
reliability to determine pre-flight minimal station participation. That
is, given an approximate trajectory and n available tracking stations, we
predict the minimum station combination for an adequate coverage of a
flight along this trajectory.

1.0 INTRODUCTION. In this paper we give a matrix treatment of
the classical least squares theory and determine each observation's
contribution to the least squares solution. If each observation's (or
observer's) contribution is known, then it may be possible to delete
certain observations (or observers), (1) to improve the least squares
solution or (2) to minimize the number of observations (or observers)
entering a least squares solution. It should be emphasized that
redundancy is necessary to obtain a statistically sound least squares
solution; however, it may be advantageously limited.

The  following  procedure  has  been  applied  successfully  in  [4,  5,  6] 
to  the  following  problem: 

GIVEN: An approximate missile trajectory and the co-ordinates
of n tracking stations (Cinetheodolites, Radars or Dovap receivers) along
with various other pre-flight data;

DETERMINE: The best minimal station combination (how many? and
which ones?) for an adequate coverage of a flight along this trajectory.


We  will  use  the  n- station  radar  position  solution  presented  in  [5]  as 
an  example  of  the  general  theory  which  follows. 


2.0 LEAST SQUARES THEORY. A brief outline of a least squares
method following the notation of D. Brown [1] will now be given. The
model under consideration is assumed to be non-linear. There are obvious
simplifications if the model is linear.


Let $\{X_i\}$ be a set of random variates $(i = 1, 2, \dots, q)$ and
$\{X_i^0\}$ be a set of uncorrelated observations of the set $\{X_i\}$.

For example: $\{A_i^0, E_i^0, R_i^0\}$, the set of azimuth, elevation
and range readings from a system of n radar stations to a missile
$(i = 1, 2, \dots, n)$.

Let $\{Y_j\}$ be a set of variates (parameters) dependent on the $X_i$:

$Y_j = Y_j(X_1, X_2, \dots, X_q), \quad (j = 1, 2, \dots, p).$

We note that the explicit form for the $Y_j$ as functions of the $X_i$
may not exist, in which case only an implicit form for this dependence is
available.

For example: $(x, y, z)$, the missile co-ordinates, are dependent on
the $\{A_i, E_i, R_i\}$.

If the set $\{X_i\}$ is such that not all the $X_i$ are necessary to
determine the entire set of $\{X_i\}$, or what is of more importance here
and in [5], to determine the derived set $\{Y_j\}$, then the set
$\{X_i\}$ is said to be over-determined. A least squares solution is in
order. We need to find a set of approximations to $\{Y_j\}$ such that the
sum of the squares of the residuals of the observed set $\{X_i^0\}$ is a
minimum.

For example: In the n-station radar case [5], each radar determines a
missile position $(x(j), y(j), z(j))$, $(j = 1, 2, \dots, n)$. These
points will coincide with probability zero. We use the least squares
method below to determine the "true" missile position.
We have

$X_i = X_i^0 + v_i \quad (i = 1, 2, \dots, q)$

$Y_j = Y_j^0 + \delta_j \quad (j = 1, 2, \dots, p)$

where $\{Y_j^0\}$ is a first approximation to $\{Y_j\}$. $\{X_i\}$ and
$\{Y_j\}$ are least squares approximations and the $v_i$ and $\delta_j$
are undetermined residuals.

Suppose the minimum number of $\{X_i\}$ required to determine the
entire set of $\{X_i\}$ is $q_0$; then the number of independent
conditional equations relating the $\{X_i\}$ and $\{Y_j\}$ is
$m = (q - q_0) + p$. Let these m equations be given by

(2.1)  $f_i(X_1, \dots, X_q, Y_1, \dots, Y_p) = 0 \quad (i = 1, 2, \dots, m).$


For example: In the radar case if 3 observations are known (azimuth,
elevation and range readings from one station) then the others can be
determined, thus $m = (3n - 3) + 3 = 3n$. In this example

$f_{3i-2} = A_i - \tan^{-1}\left[\dfrac{y - y_i}{x - x_i}\right] = 0$

$f_{3i-1} = E_i - \tan^{-1}\left[\dfrac{z - z_i}{\sqrt{(x - x_i)^2 + (y - y_i)^2}}\right] = 0$

$f_{3i} = R_i - \sqrt{(x - x_i)^2 + (y - y_i)^2 + (z - z_i)^2} = 0$

$(i = 1, 2, \dots, n)$. Note that here $(x_i, y_i, z_i)$ are the
co-ordinates of the $i$-th radar station.
Assume that the $f_i$ can be expanded in a Taylor series about the
point $t = (X_1^0, X_2^0, \dots, X_q^0, Y_1^0, \dots, Y_p^0)$. Approximate
the $f_i$ by the constant and linear terms of these Taylor expansions and
replace $X_i$ by $X_i^0 + v_i$ (and $Y_j$ by $Y_j^0 + \delta_j$).
Equation (2.1) becomes (in matrix notation)

(2.2)  $AV + BD + E = 0$  where

A is the m by q matrix $(A_{ij})$ with $A_{ij} = [\partial f_i / \partial X_j](t)$,

B is the m by p matrix $(B_{ij})$ with $B_{ij} = [\partial f_i / \partial Y_j](t)$,

E is the m by 1 matrix with $E_i = f_i(t)$,

$V = (v_1, v_2, \dots, v_q)^t$ and $D = (\delta_1, \delta_2, \dots, \delta_p)^t$.


For example: Note that A = I in the Radar and Cinetheodolite cases,
and A is a scalar matrix in the Dovap case.


Assuming uncorrelated observations, the least squares solution is
that which results in minimizing the weighted sum of the squares of the
residuals

(2.3)  $S = V^t (\sigma)^{-1} V$  where

$(\sigma)$ is the relative variance matrix of the observations
$\{X_i^0\}$. The element $(\sigma)^{-1}_{ii} = W_i$ is the weight of the
$i$-th observation.

For example: In the radar case the weight $(\sigma)^{-1}_{ii} = W_i$ can
be determined as follows. Compute

Z  x(i) 

=  id__ 

j  n  -  1 


E  y(l) 


Va 


(j  =  1.  2, 
Compute the back azimuth:


Let:  $W_{3j-2} = 1 / (A_j - A_j^0)^2$

      $W_{3j-1} = 1 / (E_j - E_j^0)^2$

      $W_{3j} = 1 / (R_j - R_j^0)^2, \quad (j = 1, 2, \dots, n).$

In the terminology of matrix algebra the problem of least squares as
considered by Brown [1, 2] and Hall [4, 5, 6] consists of determining, of
all possible vectors V and D satisfying (2.2), those which minimize (2.3).

We solve the constrained minima problem with the aid of Lagrange
multipliers. Let $\lambda = (\lambda_1, \lambda_2, \dots, \lambda_m)^t$;
from (2.2) and (2.3) we have

(2.4)  $S = V^t (\sigma)^{-1} V - 2\lambda^t (AV + BD + E).$

To determine the minimum value of S, equate to zero the partial
derivatives of S with respect to the $v_i$ and $\delta_j$.

Differentiation of S with respect to the residuals $v_i$ yields

(2.5)  $(\sigma)^{-1} V - A^t \lambda = 0$  or  $V = (\sigma) A^t \lambda.$
Differentiation of S with respect to the residuals $\delta_j$ yields

(2.6)  $B^t \lambda = 0.$

Substitution of (2.5) into (2.2) yields

(2.7)  $(A (\sigma) A^t) \lambda + BD + E = 0.$

If $(A (\sigma) A^t)$ is nonsingular then the least squares solution
results from

(1.) Solve (2.7) for $\lambda = -(A (\sigma) A^t)^{-1} (BD + E)$.

(2.) Substitute $\lambda$ into (2.6) and derive the Reduced Normal Equation

(2.8)  $ND + C = 0$  where

$N = B^t (A (\sigma) A^t)^{-1} B$ and $C = B^t (A (\sigma) A^t)^{-1} E$.

(3.) Solve (2.8) for D.

(4.) Solve (2.5) for V.
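
Steps (1.) to (4.) can be sketched in plain Python for a small problem. The helper names and the example are hypothetical and chosen only for illustration: three direct readings of a single quantity with conditional equations f_i = X_i - Y = 0, so that A = I, B is a column of -1 entries and E holds the readings minus the first approximation. The relative variance matrix is assumed diagonal and is passed as a list.

```python
def transpose(M):
    return [list(col) for col in zip(*M)]


def matmul(P, Q):
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]


def solve(M, rhs):
    """Solve M X = rhs (rhs may have several columns) by Gauss-Jordan elimination."""
    n = len(M)
    aug = [list(M[i]) + list(rhs[i]) for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        if abs(aug[p][c]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for r in range(n):
            if r != c:
                factor = aug[r][c]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def least_squares_step(A, B, E, sigma_diag):
    """One pass through steps (1.)-(4.); returns the column matrices D and V."""
    sigma_At = [[s * v for v in row] for s, row in zip(sigma_diag, transpose(A))]
    G = matmul(A, sigma_At)                      # A (sigma) A^t
    Bt = transpose(B)
    N = matmul(Bt, solve(G, B))                  # B^t G^-1 B
    C = matmul(Bt, solve(G, E))                  # B^t G^-1 E
    D = [[-v for v in row] for row in solve(N, C)]           # from (2.8)
    BD = matmul(B, D)
    rhs = [[bd[0] + e[0]] for bd, e in zip(BD, E)]
    lam = [[-v for v in row] for row in solve(G, rhs)]       # from (2.7)
    V = matmul(sigma_At, lam)                                # from (2.5)
    return D, V


# Hypothetical example: three readings of one quantity, the third less precise.
X0 = [10.0, 12.0, 14.0]
Y0 = 0.0
A = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
B = [[-1.0] for _ in X0]
E = [[x - Y0] for x in X0]
sigma_diag = [1.0, 1.0, 4.0]

D, V = least_squares_step(A, B, E, sigma_diag)

if __name__ == "__main__":
    print("correction D:", D)
    print("residuals V:", V)
```

In this linear example a single pass gives the weighted mean of the readings; for a non-linear model the pass would be repeated with the updated approximation, as described in the text.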

In most cases the matrix $A (\sigma) A^t$ is nonsingular and (2.8) is
valid. In the few cases where this is not true, it is possible to remove
the difficulty by manipulating the conditional equations [2].


We have computed a least squares approximation to the parameters
$\{Y_j\}$ using $\{Y_j^0\}$ as an initial approximation. We now repeat
this procedure using $\{Y_j^0 + \delta_j\}$ instead of $\{Y_j^0\}$ as an
approximation and compute a new residual matrix D. The iteration
continues until $\|D\|$ is sufficiently small.


Since we want to delete observations (or observers), we need some
basis for determining which observations are the most likely candidates
for deletion. We use the partial derivatives

$\dfrac{\partial(\delta_1, \delta_2, \dots, \delta_p)}{\partial(X_1^0, X_2^0, \dots, X_q^0)}$

evaluated at t to aid in this determination.
3.0  As stated in the introduction there are two distinct motives for
deleting observations. In general if we are trying

(a.) TO IMPROVE THE SOLUTION

WANT: $\partial\delta_j / \partial X_i^0$ small, so that errors in
$X_i^0$ will have little effect on $\delta_j$.

DELETE: $\partial\delta_j / \partial X_i^0$ large, since a small error
in $X_i^0$ will result in a large error in the $\delta_j$.

(b.) TO MINIMIZE PARTICIPATION

WANT: $\partial\delta_j / \partial X_i^0$ large, since this observation
($X_i^0$) has a great effect on the solution.

DELETE: $\partial\delta_j / \partial X_i^0$ small, since this
observation ($X_i^0$) has little effect on the solution.

Let $U = (X_1^0, X_2^0, \dots, X_q^0)$ and define the p by q matrix

$D_U = [\partial / \partial U][D] = \left(\dfrac{\partial\delta_i}{\partial X_j^0}\right)$

where each $\partial\delta_i / \partial X_j^0$ is evaluated at t.

One of the objectives of this paper is the derivation of $D_U$. Note
that $\partial\delta_i / \partial X_j^0$ is the rate of change of
$\delta_i$ (the correction in the dependent variable $Y_i$) with respect
to the observation $X_j^0$.

For example: In the radar case $\partial\delta_i / \partial X_j^0$ is the
rate of change of the correction in one of the missile position
co-ordinates with respect to a change in azimuth, elevation or range at
the station.


From (2.8) we have

$D = -N^{-1} C = -[B^t (A (\sigma) A^t)^{-1} B]^{-1} [B^t (A (\sigma) A^t)^{-1}] E.$


Since observational errors have no significant effect on the matrices
A, B and $(\sigma)$, they may be regarded as constants in the propagation
of error under consideration. The vector E, however, is affected by the
observational errors. Thus the error in D arises primarily from errors
in E, which in turn are caused by errors in the observational vector U.
Therefore

$D_U = -N^{-1} R E_U$  where  $R = B^t (A (\sigma) A^t)^{-1}$  and  $E_U = [\partial / \partial U][E]$.


But $E_U = A$ and thus

(3.1)  $D_U = -N^{-1} R A.$

Note the simplification if A = I, as is the case in [4, 5].

4.0 VARIANCE-COVARIANCE MATRIX. A well known [2, 7]
generalized law of covariance (in matrix notation) states that if
$D = (\delta_1, \dots, \delta_p)$ is a vector of functions of the elements
of the vector $U = (X_1^0, X_2^0, \dots, X_q^0)$ which has the variance
matrix $(\sigma)$, then the variance-covariance matrix of the vector D is
given by

$(\sigma_D) = D_U (\sigma) D_U^t.$
Using (3.1), this reduces to

(4.1)  $(\sigma_D) = \sigma_0^2 N^{-1}.$


Note that $\sigma_0^2$ is the population variance and $(\sigma)$ is a
relative variance matrix of the observations.
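
The step from the law of covariance to (4.1) can be filled in as follows. This is a sketch that assumes (3.1) and a symmetric $(\sigma)$, and it uses the abbreviation $G = A (\sigma) A^t$, a symbol introduced here for convenience and not used in the paper; the relative variance matrix is scaled by $\sigma_0^2$ in the last line.

```latex
\begin{align*}
D_U (\sigma) D_U^t
  &= N^{-1} R\, A (\sigma) A^t R^t N^{-1}
  && \text{by (3.1), with } N \text{ symmetric} \\
  &= N^{-1} \left(B^t G^{-1}\right) G \left(G^{-1} B\right) N^{-1}
  && R = B^t G^{-1} \\
  &= N^{-1} \left(B^t G^{-1} B\right) N^{-1} = N^{-1} N N^{-1} = N^{-1}, \\
(\sigma_D) &= \sigma_0^2\, N^{-1}.
\end{align*}
```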


In the radar case

$\sigma_0^2 = \dfrac{\sum_{i=1}^{n} \left(v_{i1}^2 W_{i1} + v_{i2}^2 W_{i2} + v_{i3}^2 W_{i3}\right)}{3n - 3}.$


5.0 VARIABILITY ESTIMATE. For each correction $\delta_i$ of the
derived quantities $Y_i$, a "variability estimate" will now be associated
with each observation.

In the radar case, for each co-ordinate residual a variability estimate
is associated with each tracking station.

Consider the matrix $H = \sigma_0 D_U (\sigma)^{1/2}$. Note that

$H_{ij} = \dfrac{\sigma_0}{\sqrt{W_j}} \dfrac{\partial\delta_i}{\partial X_j^0}, \quad (i = 1, 2, \dots, p;\; j = 1, 2, \dots, q)$

and

$H H^t = (\sigma_D).$


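It follows that the variance in the derived quantity $Y_i$ is

(5.1)  $\sigma_{Y_i}^2 = \sum_{j=1}^{q} H_{ij}^2 = \sum_{j=1}^{q} \dfrac{\sigma_0^2}{W_j} \left(\dfrac{\partial\delta_i}{\partial X_j^0}\right)^2, \quad (i = 1, 2, \dots, p).$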


Since $H_{ij}^2$ is the $j$-th observation's contribution to the
variance in $Y_i$, we will refer to $H_{ij}^2$ as the "variability
estimate" in $\delta_i$ for the $j$-th observation
$(i = 1, 2, \dots, p;\; j = 1, 2, \dots, q)$.
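
The variability estimates can be computed and checked against (5.1) with a short script. This is a minimal sketch for the case A = I, with a diagonal weight matrix passed as a list; the helper names, the straight-line example and the value used for the population variance are hypothetical and serve only to show that the estimates for each parameter sum to the corresponding diagonal element of the variance-covariance matrix.

```python
def transpose(M):
    return [list(col) for col in zip(*M)]


def matmul(P, Q):
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]


def inverse(M):
    """Inverse of a small square matrix by Gauss-Jordan elimination."""
    n = len(M)
    aug = [list(M[i]) + [1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        if abs(aug[p][c]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for r in range(n):
            if r != c:
                factor = aug[r][c]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def variability_estimates(B, weights, sigma0_sq):
    """Return (H_ij squared, N inverse) for the case A = I."""
    # R = B^t (sigma)^-1, where (sigma)^-1 is the diagonal matrix of weights.
    R = [[b * w for b, w in zip(row, weights)] for row in transpose(B)]
    N = matmul(R, B)
    N_inv = inverse(N)
    D_U = [[-v for v in row] for row in matmul(N_inv, R)]    # (3.1) with A = I
    H_sq = [[sigma0_sq * d * d / w for d, w in zip(row, weights)] for row in D_U]
    return H_sq, N_inv


# Hypothetical example: straight line Y1 + Y2 * t observed at four times,
# conditional equations f_j = X_j - (Y1 + Y2 * t_j) = 0.
times = [0.0, 1.0, 2.0, 3.0]
B = [[-1.0, -t] for t in times]
weights = [1.0, 1.0, 1.0, 0.25]
sigma0_sq = 2.0

H_sq, N_inv = variability_estimates(B, weights, sigma0_sq)

if __name__ == "__main__":
    for i, row in enumerate(H_sq):
        print("parameter", i + 1, [round(h, 4) for h in row], "sum", round(sum(row), 4))
```

Each printed row lists the contribution of every observation to the variance of one parameter, which is the quantity proposed above for ranking observations.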
In the radar case there are three observations per station (azimuth,
elevation and range) and thus the variability estimate "for the station"
is defined as the sum of the variability estimates (as defined above) for
the azimuth, elevation and range readings at the $j$-th station. We are
interested in eliminating stations, and thus observations three at a time.

$H_{i,3j-2}^2 + H_{i,3j-1}^2 + H_{i,3j}^2$

is the variability estimate for the $j$-th station, where
$X_{3j-2} = A_j$, $X_{3j-1} = E_j$ and $X_{3j} = R_j$
$(j = 1, 2, \dots, n)$.


6.0 MOTIVES FOR DELETING OBSERVATIONS. We will now
discuss motives or reasons why one might want to delete observations
before computing a least squares solution.


6.1 TO IMPROVE LEAST SQUARES SOLUTION. In this case we
are interested in deleting observations which are "extremely" poor,
that is, observations which contribute greatly to the variances. Certainly
if all of the $H_{ij}^2$ (j = 1, 2, ..., q) are relatively close to being equal then
no observation is predominantly worse than the others and no observation
should be deleted as a result of investigating the variability estimates. One
should remember that usually the variances increase with a decrease in
observations. However, if one (or more) observation's variability
estimate is quite large in comparison to the others, then this observation
would be considered a predominant contributor to the variances (or
least squares solution) and would definitely be a candidate for deletion.
One must consider an observation's contribution to each variance
$\sigma_{Y_i}^2$ (i = 1, 2, ..., p) when deciding if an observation should be deleted.


There are various ways one might want to combine these contributions
to the variances $\sigma_{Y_i}^2$ so as to be able to order the observations (or
observers). In the radar case we have three variances $\sigma_X^2$, $\sigma_Y^2$, $\sigma_Z^2$
(p = 3) to consider and define station constants

$D_j = \sqrt{C_{1j}^2 + C_{2j}^2 + C_{3j}^2}$   (j = 1, 2, ..., n).

The stations are then ordered according to the magnitude of their station
constants:

$D_{i_1} \ge D_{i_2} \ge \ldots \ge D_{i_n}$.
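The station sums and the ordering can be sketched as follows. This is only an illustration: it assumes the station constant is the root sum of squares of the three station variability estimates, and that the columns of the input are grouped in station triples (azimuth, elevation, range). The function name and the test array are hypothetical.

```python
import numpy as np

def station_ordering(V):
    """V: (p, 3n) array of variability estimates for n stations.

    Returns (C, D, order): C is (p, n), the per-station sums over the three
    readings; D is (n,), the assumed root-sum-of-squares station constant;
    order lists station indices (0-based) from largest to smallest D.
    """
    V = np.asarray(V, dtype=float)
    p, q = V.shape
    if q % 3 != 0:
        raise ValueError("expected three observations per station")
    C = V.reshape(p, q // 3, 3).sum(axis=2)
    D = np.sqrt((C ** 2).sum(axis=0))
    order = np.argsort(-D, kind="stable")
    return C, D, order

V = np.arange(18, dtype=float).reshape(3, 6)   # hypothetical values, n = 2
C, D, order = station_ordering(V)
```

The first entry of `order` is the candidate for deletion when the aim is to improve the solution; the last entry is the candidate when the aim is to reduce the number of stations.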


To improve a least squares solution the station corresponding to
the largest station constant is designated the most likely to be deleted.

This case of improving the solution, not being our main motive for the
study, has not yet been thoroughly investigated.

6.2 TO MINIMIZE THE NUMBER OF OBSERVATIONS. In this
case we are not primarily interested in an improved solution, but
rather in deleting observations which contribute "very little" to the solution,
so as to minimize the data that we must consider for a solution. The
observations (or observers) that contribute least to the variances, or those
with the smallest variability estimates, are the most likely candidates for
deletion. Our motive here might be completely logistical.

In the radar case, it should be pointed out that the matrices needed
to obtain the ordering of stations given above ($D_{i_1} \ge D_{i_2} \ge \ldots \ge D_{i_n}$)
can be determined (or at least approximated) before flight. To find the
variability estimates we need to know:

(1) $B = (b_{ij})$, formed from the partial derivatives relating $(X, Y, Z)$
to $(A_j, E_j, R_j)$. This matrix is readily computed
given station co-ordinates and an approximate missile position.
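To show why such a matrix is easy to compute before flight, the sketch below evaluates the partial derivatives of (azimuth, elevation, range) with respect to missile position for each station. It is an assumption-laden illustration: the angle conventions (azimuth = atan2(dy, dx), elevation measured from the horizontal, radians), the function names and the orientation of the stacked matrix are all hypothetical, and the paper's B may be arranged differently.

```python
import numpy as np

def observe(station, missile):
    """(azimuth, elevation, range) of the missile seen from the station."""
    dx, dy, dz = np.asarray(missile, float) - np.asarray(station, float)
    h = np.hypot(dx, dy)
    return np.array([np.arctan2(dy, dx), np.arctan2(dz, h), np.sqrt(h * h + dz * dz)])

def observation_partials(station, missile):
    """3 x 3 Jacobian of (azimuth, elevation, range) with respect to (x, y, z)."""
    dx, dy, dz = np.asarray(missile, float) - np.asarray(station, float)
    h2 = dx * dx + dy * dy
    h = np.sqrt(h2)
    r2 = h2 + dz * dz
    r = np.sqrt(r2)
    return np.array([
        [-dy / h2, dx / h2, 0.0],                             # azimuth row
        [-dx * dz / (h * r2), -dy * dz / (h * r2), h / r2],   # elevation row
        [dx / r, dy / r, dz / r],                             # range row
    ])

def stack_partials(stations, missile):
    """Stack the per-station blocks into a (3n, 3) matrix."""
    return np.vstack([observation_partials(s, missile) for s in stations])

stations = [[0.0, 0.0, 0.0], [100.0, 0.0, 0.0]]      # hypothetical sites
missile = np.array([3000.0, 4000.0, 12000.0])        # hypothetical position
B_example = stack_partials(stations, missile)
```

Only station co-ordinates and an approximate missile position enter the computation, which is the point made in item (1).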

(2) $(\sigma) = \mathrm{dg}(\sigma_{11}, \ldots, \sigma_{3n,3n})$ = variance-covariance matrix
of the observation variables. If the standard deviations $\sigma_{A_j}$, $\sigma_{E_j}$, $\sigma_{R_j}$
(j = 1, 2, ..., n) are known from past histories then set:

$(\sigma)_{3j-2,3j-2} = \sigma_{A_j}^2 \cos^2 E_j / \sigma_0^2$

$(\sigma)_{3j-1,3j-1} = \sigma_{E_j}^2 / \sigma_0^2$

$(\sigma)_{3j,3j} = \sigma_{R_j}^2 / \sigma_0^2$   (j = 1, 2, ..., n)

where

$\sigma_0^2 = \frac{\sum_{j=1}^{n} (\sigma_{A_j}^2 + \sigma_{E_j}^2 + \sigma_{R_j}^2 / R_j^2)}{3n}$.


In the Cinetheodolite Study, DR-Q has estimates of $\sigma_{A_j}$ and $\sigma_{E_j}$, and
plans are being made to keep records for the Radar and Dovap systems.

If the standard deviations are not available, then the present weighting
scheme at WSMR may be used, setting

$(\sigma)_{3j-2,3j-2} = 1 / (R_j^2 \cos^2 E_j)$

$(\sigma)_{3j-1,3j-1} = 1 / R_j^2$

$(\sigma)_{3j,3j} = 1$   (j = 1, 2, ..., n).

In this latter case an approximation of $\sigma_0$ is used instead of the above
calculated value. (If neither of these weighting schemes is acceptable,
then one can simply set $(\sigma) = I$.)
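The fallback weighting can be written down in a few lines. This sketch assumes the entries read 1/(R² cos² E), 1/R² and 1 for azimuth, elevation and range respectively, ordered station by station; the function name and inputs are hypothetical, and elevations are taken in radians.

```python
import numpy as np

def wsmr_relative_variances(ranges, elevations):
    """Diagonal of the relative variance matrix, ordered A_1, E_1, R_1, A_2, ...

    ranges: approximate slant range per station.
    elevations: approximate elevation angle per station, in radians.
    """
    R = np.asarray(ranges, dtype=float)
    E = np.asarray(elevations, dtype=float)
    diag = np.empty(3 * R.size)
    diag[0::3] = 1.0 / (R ** 2 * np.cos(E) ** 2)   # azimuth entries
    diag[1::3] = 1.0 / R ** 2                      # elevation entries
    diag[2::3] = 1.0                               # range entries
    return diag

diag = wsmr_relative_variances([2.0, 4.0], [0.0, np.pi / 3])
```

Passing `np.ones(3 * n)` instead reproduces the simplest choice, the identity matrix.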


(3) $D_U = -[B^t (\sigma)^{-1} B]^{-1} [B^t (\sigma)^{-1}]$ since A = I.

(4) $H = \sigma_0 D_U (\sigma)^{1/2}$, and thus the variability estimates and station
constants are available before flight.
Thus before flight we can order the stations geometrically by using
"observational" errors, with the standard deviations $\sigma_{A_j}$, $\sigma_{E_j}$ and $\sigma_{R_j}$,
in a simulated least squares solution.


It should be pointed out here that this ordering determines the best
k station combination ($k \le n$) as the stations $(i_1, i_2, \ldots, i_k)$. Otherwise
one would have to consider $\frac{n!}{k!\,(n-k)!}$ possible combinations of k
station solutions to arrive at this stage.
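The saving is easy to quantify. The sketch below counts the subsets an exhaustive search would compare and contrasts that with taking the first k stations of an existing ordering; the values n = 12 and k = 5 and the helper names are hypothetical.

```python
from math import comb

def exhaustive_count(n, k):
    """Number of distinct k-station subsets of n stations: n! / (k! (n-k)!)."""
    return comb(n, k)

def best_k_from_ordering(order, k):
    """With the stations already ordered, the best k are simply the first k."""
    return list(order[:k])

n, k = 12, 5
count = exhaustive_count(n, k)                      # 792 subsets to compare otherwise
chosen = best_k_from_ordering(range(1, n + 1), k)   # hypothetical ordering 1..12
```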

In  the  final  stage  the  Minimal  Station  Participation  problem  [4,5,6] 
takes  the  form: 


GIVEN: (1) A geometric ordering of n stations ($D_{i_1} \ge D_{i_2} \ge \ldots \ge D_{i_n}$),

(2) A reliability factor $P_j$ for each station - the probability
of successful operation if scheduled,

(3) Data precision factors for each variable (A, E, R) per
station - $\sigma_{A_j}$, $\sigma_{E_j}$, $\sigma_{R_j}$,

(4) Necessary data to determine tracking capabilities such
as tracking rates (focal lengths and object size in the case of Cinetheodolite),
etc.

FIND: A subsystem of k stations ($k \le n$), k a minimum, such that
for this particular point and missile we have:

(1) Each station in the subsystem is able to track,

(2) The probability that two or more (three or more in the
Cinetheodolite case) of the k stations will operate successfully is greater
than P,

(3) The geometric ordering given above is such that the
stations deleted are insignificant contributors to the solution.

Thus we consider station ability, reliability and geometry in determining
the Minimal Station Participation Before Flight (MSPARB) System.
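The reliability condition alone can be sketched as below. This is not the MSPARB program: it assumes stations fail independently, it ignores the ability and geometry tests, and the function names and probabilities are hypothetical.

```python
def prob_at_least(m, probs):
    """Probability that at least m of the stations operate successfully.

    probs: per-station success probabilities; independence is assumed.
    """
    dist = [1.0]                      # dist[s] = P(exactly s successes so far)
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for s, mass in enumerate(dist):
            nxt[s] += mass * (1.0 - p)
            nxt[s + 1] += mass * p
        dist = nxt
    return sum(dist[m:])

def minimal_subsystem(ordered_probs, m, P):
    """Smallest k such that the first k ordered stations meet the requirement.

    Returns None if even the full set of stations does not exceed P.
    """
    for k in range(m, len(ordered_probs) + 1):
        if prob_at_least(m, ordered_probs[:k]) > P:
            return k
    return None

k_radar = minimal_subsystem([0.9] * 5, m=2, P=0.95)   # two or more must operate
k_cine = minimal_subsystem([0.9] * 5, m=3, P=0.95)    # three or more must operate
```

Here `m` is 2 for the radar case and 3 for the Cinetheodolite case, following condition (2) above.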




The RADAR and DOVAP programs are in the process of being written.
Consider the following SLIDE of the MSPARB Cinetheodolite program [4],
of 13 August 1965.


The input includes

(1) $(x_j, y_j, z_j)$   (j = 1, 2, ..., n), WSCS co-ordinates
of the station,

(2) $(x, y, z)$, an approximate missile position,

(3) $(\dot{x}, \dot{y}, \dot{z})$, approximate velocity components,

(4) $(\ddot{x}, \ddot{y}, \ddot{z})$, approximate acceleration components,

(5) $\sigma_{A_j}$   (j = 1, 2, ..., n), the standard deviation
in azimuth readings at the j-th station,

(6) $\sigma_{E_j}$   (j = 1, 2, ..., n), the standard deviation
in elevation readings at the j-th station,

(7) (j = 1, 2, ..., n), the angular velocity
limit in azimuth for the j-th station,

(8) (j = 1, 2, ..., n), the angular acceleration
limit in azimuth for the j-th station,

(9) (j = 1, 2, ..., n), the angular velocity
limit in elevation for the j-th station,

(10) (j = 1, 2, ..., n), the angular acceleration
limit in elevation for the j-th station,

(11) $F_j$   (j = 1, 2, ..., n), effective focal length
of the j-th camera,

(12) O, object size,

(13) $P_j$   (j = 1, 2, ..., n), the probability that
station j will operate successfully if
scheduled.
Notice that the criterion for deletion of stations contains three main
considerations:

I. STATION ABILITY. All stations considered will first be tested
as to inability to track for a certain interval for one or all of the following
reasons:

(1) Image size too small,

(2) Tracking rates too large,

(3) Elevation angle too small.

II. STATION RELIABILITY. The minimum number of stations is
chosen so that the probability of three or more stations operating
successfully at any one time is greater than a pre-determined number.

III. STATION GEOMETRY. The stations are ordered according to
station geometry. Stations are deleted if their geometric contributions
are "insignificant".

Program output includes:

(1) Print out of all or part of input to program,

(2) Computed azimuth and elevation angles from each station to
the point under consideration,

(3) Computed approximations to expected standard deviations in
missile co-ordinates and angular standard deviation,

(4) Geometric ordering of stations to include station numbers
and geometric factors,

(5) The probability that three or more of the stations in MSPARB
will operate successfully if scheduled.

Modifications of the above MSPARB Cinetheodolite program since 13 Aug 65
include (1) a print out of error estimates for the system of the worst three
stations in MSPARB as well as error estimates for MSPARB, (2) a print
out of cumulative error estimates over the entire trajectory, (3) a print out
of how many times a station was used over the entire trajectory. (I have
available here sample print outs for a few trajectories if anyone is interested.)

Areas where MSPARB can be used include:

(1) Schedule determination,

(2) Minimizing the current scheduling efforts,

(3) Determination of best launch point (balloons),

(4) Determination of best positioning of mobile units,

(5) Determination of best positioning of future station sites,

(6) Statement of expected system (MSPARB) errors - (confidence
interval) - pre-flight,

(7) Determination of which system (Cinetheodolites, Radar, or
Dovap) or combination of systems will yield the best trajectory coverage -
BET,

(8) Pure error studies concerning geometry versus data precision.

Let us close by stating again that redundancy is necessary to obtain a
statistically sound least squares solution; however, through the methods
outlined here it can very definitely be advantageously limited.
PRECISION AND BIAS ESTIMATES FOR DATA FROM
CINETHEODOLITE AND AN/FPS-16 RADAR
TRAJECTORY MEASURING SYSTEMS

Oliver  L.  Kingsley  and  Burton  L.  Williams 
Range  Instrumentation  Systems  Office 
White  Sands  Missile  Range,  New  Mexico 


INTRODUCTION. A series of flight tests has been conducted at
White Sands Missile Range in an effort to obtain a comparison of trajectory
data derived from the measurements produced by different instrumentation
systems. The instrumentation systems that have been used in some of
these tests are Ballistic Camera, DOVAP, Cinetheodolite, and FPS-16
Radars. Interim reports were prepared, based on the data from the three
earlier flights conducted on March 29, 1960, September 19, 1960, and
January 29, 1962. Mr. Kingsley and Mr. Free presented a summary of
the analysis and results of these earlier flights at the sixth, seventh and
ninth annual meetings of this conference.

Purpose  of  Report 

The fourth flight test was conducted on October 1, 1962 using a modified
Nike Hercules Missile. The purpose of this report is to present an analysis
of the bias and random error associated with some of the major range
instrumentation systems used for this flight and to compare these data with
the data from the earlier flight tests.

Comparability  of  Results  and  Earlier  Flight  Tests 

The precision estimates are directly comparable but the bias estimates
are not, because the comparison with trajectory data from the Ballistic
Camera System was not available.

The earlier three flight tests were conducted at night so the Ballistic
Camera System could be utilized to obtain trajectory data to be used as
a standard for position bias error estimation. The Ballistic Camera, used
on earlier tests, photographed a flashing light beacon on-board the missile
against a star trail background. The light beacon flashes were controlled
from the ground by a transponder aboard the missile.
Fourth Flight Test

The fourth flight test was conducted during the daylight hours utilizing
two cinetheodolite systems and seven AN/FPS-16 radar systems, though
only two of the radar systems are analyzed here. The Askania Cinetheodolite
System was used as the reference standard for system position bias
error estimation for the Contraves cinetheodolite and FPS-16 radar systems.
No DOVAP or Ballistic Camera systems were used for this fourth flight
test. The AN/FPS-16 radar systems were operated successfully in the
beacon tracking mode for the first time during this fourth test of the series.
Attempts were made to use the FPS-16 radar systems in the beacon tracking
mode for the three earlier flight tests, but the on-board beacon did not
operate properly.

Position,  Velocity  and  Acceleration,  Precision  and  Bias 

In addition to the estimates of bias and precision for the position data,
as given in the earlier reports, estimates of the bias and precision
for the derived velocity, acceleration and smoothed trajectory position
data are presented. These fourth flight test estimates of bias for position,
velocity and acceleration are based on data taken from the Askania
cinetheodolite system.

PRECISION  ESTIMATES  FOR  TRAJECTORY  DATA. 


Standard  Deviation  Estimate 

Precision estimates were derived from trajectory data obtained from
two cinetheodolite systems and two AN/FPS-16 radar systems in terms
of standard deviations for the Cartesian component trajectory data. The
standard deviation estimates were derived by the multi-instrument
components of variance technique as given by Simon and Grubbs [1,2].
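For readers unfamiliar with this family of estimators, the sketch below shows the classical three-instrument form: when several instruments observe the same quantity with independent errors, the covariance of two differences that share an instrument estimates that instrument's error variance. It is an illustration only; the report used four systems and the procedure of [1,2], and the function name and synthetic data here are hypothetical.

```python
import numpy as np

def grubbs_three_instrument(x1, x2, x3):
    """Estimate the error variances of three instruments observing one quantity.

    Uses v1 = S11 - S12 - S13 + S23 (and cyclic analogues), which equals the
    sample covariance of (x1 - x2) and (x1 - x3). Errors are assumed to be
    independent of each other and of the true signal.
    """
    S = np.cov(np.vstack([x1, x2, x3]))
    v1 = S[0, 0] - S[0, 1] - S[0, 2] + S[1, 2]
    v2 = S[1, 1] - S[0, 1] - S[1, 2] + S[0, 2]
    v3 = S[2, 2] - S[0, 2] - S[1, 2] + S[0, 1]
    return v1, v2, v3

rng = np.random.default_rng(1)
truth = rng.normal(0.0, 100.0, size=50000)            # synthetic true positions
x1 = truth + rng.normal(0.0, 2.0, size=truth.size)    # instrument errors, sd 2
x2 = truth + rng.normal(0.0, 5.0, size=truth.size)    # sd 5
x3 = truth + rng.normal(0.0, 10.0, size=truth.size)   # sd 10
v1, v2, v3 = grubbs_three_instrument(x1, x2, x3)
```

The true trajectory cancels in the differences, so no external standard is needed for precision estimates; with short samples the estimates can come out negative, which is a known limitation of this approach.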

Instrument  Reduction  for  Position 


The cinetheodolite trajectory position data were derived from a least
squares reduction of angular measurements [3]. The Askania cinetheodolite
system was a five instrument system making ten angular observations
for each trajectory space point; the Contraves cinetheodolite system was
a three instrument system for trajectory section one and a two instrument
system for trajectory section two, making six and four angular observations
respectively for each trajectory space point. The radar trajectory
position data were derived from the range, azimuth, and elevation observations
that were reduced to the Cartesian coordinate system.

Mathematical  Model 

A mathematical model for the trajectory position data from the jth instrument
system at the ith time may be written: $X_{ij} = X_i(\text{true}) + e_{ij}$, where
$e_{ij}$ represents a composite random error for the jth instrument system
at the ith time. Standard deviation estimates were determined for these
position data, and also for sets of smoothed position, velocity, and
acceleration data that were derived by fitting a set of component position
data to an eleven point second degree polynomial in time, and evaluating
at the midpoint for successive trajectory space points (50 per trajectory
section). The polynomial equation for the smoothed X-component data for
the ith time would be of the form:

(1)  $X_{ij}(\text{smoothed}) = a_{0j} + a_{1j} t_i + a_{2j} t_i^2$

for the jth instrumentation system. An error would generally be associated
with each of the coefficients for the jth instrumentation system. A
composite random error for the jth system can be expressed in the
mathematical model:

(2)  $X_{ij}(\text{smoothed}) = X_i(\text{true}) + e_{ij}$

where $e_{ij}$ is the composite random error for the jth system at the ith time.

The velocity equation is written:

(3)  $\dot{X}_{ij} = a_{1j} + 2 a_{2j} t_i$

The composite random error for the velocity data can be expressed by
the velocity equation:

(4)  $\dot{X}_{ij} = \dot{X}_i(\text{true}) + \dot{e}_{ij}$

where the composite random error in velocity ($\dot{e}_{ij}$) arises in the two
terms of the velocity equation. A similar pair of equations could be
written for the derived acceleration data.
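The smoothing step can be sketched as a moving eleven point quadratic fit evaluated at the window midpoint. This is a minimal illustration, not the reduction program used for the report: the function name is hypothetical, time is re-centred on each midpoint so that the fitted coefficients give position, velocity and acceleration directly, and the 0.2 second spacing is inferred from eleven points spanning two seconds.

```python
import numpy as np

def smooth_midpoint(t, x, window=11):
    """Moving second-degree least squares fit, evaluated at each window midpoint.

    Returns an array with one row per midpoint: (position, velocity,
    acceleration), i.e. (a0, a1, 2*a2) of the fit in time centred on the midpoint.
    """
    t = np.asarray(t, dtype=float)
    x = np.asarray(x, dtype=float)
    half = window // 2
    rows = []
    for c in range(half, len(t) - half):
        tw = t[c - half:c + half + 1] - t[c]      # centre time on the midpoint
        a2, a1, a0 = np.polyfit(tw, x[c - half:c + half + 1], 2)
        rows.append((a0, a1, 2.0 * a2))
    return np.array(rows)

t = np.linspace(0.0, 4.0, 21)                     # 0.2 s spacing
x = 5.0 + 3.0 * t + 2.0 * t ** 2                  # hypothetical noise-free track
derived = smooth_midpoint(t, x)
```

For noise-free quadratic input the derived velocity is 3 + 4t and the derived acceleration is 4 at every midpoint, which makes a convenient check before applying the routine to real data.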

Discussion  of  Precision  Estimates 


The position standard deviation estimates presented in Table I represent
essentially random error in position data from the particular system. The
standard deviation estimates range from two to twenty-two feet with the
exception of trajectory section two for the Contraves system where the
system geometry is very poor. Generally, this would not be considered
satisfactory coverage; it is included for the sake of continuity.

The position, velocity, and acceleration standard deviation estimates
presented in Tables II, III, IV, and V represent the residual random error
in the derived (or smoothed) position, derived velocity, and derived
acceleration data respectively. The velocity standard deviations for the
cinetheodolite data ranged from two feet per second to eleven feet per
second except for the second trajectory section for the Contraves
cinetheodolite. The velocity standard deviations for the radar data ranged from
three feet per second to sixteen feet per second. Velocity data derived
from the radar observations are as good as the velocity data derived from
the cinetheodolites with respect to variability. The cinetheodolite systems
and the radar systems are essentially equivalent in variability with respect
to the acceleration data; the only exceptional values are the two large
acceleration standard deviations due to the poor system geometry for the
Contraves system.

BIAS ESTIMATES FOR TRAJECTORY DATA.

Standards Used In Computation.

All of the bias estimates for Flight Test Nr. 4 of the Operation
Precise Program are based on trajectory data from the Askania cinetheodolite
system with a mode of ten angular measurements. Earlier flight
tests have used trajectory data from the Ballistic Camera System which
was based on star trail background for calibration. The Askania system
does  very  well  in  the  determination  of  the  horizontal  trajectory  position 
points  but  has  some  bias  in  the  vertical  determination  as  indicated  by 
earlier flight tests [9, 11, 12].
Definition of Bias Errors and Discussion


The average bias estimates for position, velocity and acceleration
are presented in Tables VI, VII, and VIII for the respective Contraves,
Radar 112 and Radar 122 systems. A positive average bias means that the
particular system trajectory data were on the average greater than the
corresponding data from the Askania system.

The average absolute component position bias estimates ranged from
a low of six feet to a high of eighty-two feet. The velocity and acceleration
average bias estimates were low. The largest velocity component bias
was four feet per second; the largest acceleration component bias was
only seven feet per second squared. The explanation for the large average
position bias error and the much smaller average velocity and acceleration
bias error is that the trajectories as determined by the instrumentation
systems are parallel but differ by a constant amount in position. This
means that the least squares fits differ by essentially the constant term
of the second degree polynomial in time.
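The argument about parallel trajectories can be illustrated numerically: adding a constant position offset to a track changes only the constant term of a second degree fit, so the derived velocity and acceleration show no bias. The numbers below are hypothetical and noise-free.

```python
import numpy as np

def fit_quadratic(t, x):
    """Least squares second degree fit; returns (a0, a1, a2)."""
    a2, a1, a0 = np.polyfit(t, x, 2)
    return a0, a1, a2

t = np.linspace(-1.0, 1.0, 11)                      # eleven points over two seconds
reference = 1000.0 + 300.0 * t - 16.0 * t ** 2      # hypothetical standard-system track
other = reference + 50.0                            # same track, offset by 50 feet

a0_r, a1_r, a2_r = fit_quadratic(t, reference)
a0_o, a1_o, a2_o = fit_quadratic(t, other)

position_bias = a0_o - a0_r                 # recovers the constant offset
velocity_bias = a1_o - a1_r                 # essentially zero
acceleration_bias = 2.0 * (a2_o - a2_r)     # essentially zero
```

The sign convention follows the text: the standard system's value is subtracted from the other system's value.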

A comparison of the unsmoothed position data from the Contraves and
radar systems with the corresponding data from the Askania system reveals
that the average bias does not differ from the corresponding bias estimates
shown in Tables VI, VII, and VIII by more than one foot. This indicates
that the smoothing process either moves the average bias estimate the same
amount for all systems or that smoothing does not change the bias. A
further study of the smoothed and unsmoothed trajectory data from the
Askania system reveals that the smoothing process leaves the Askania
trajectory data essentially unchanged.

SOME COMPARISONS OF PRECISION ESTIMATES WITH EARLIER
FLIGHT TESTS. Comparisons with earlier flight tests were possible for the
Askania System and the two FPS-16 Radar Systems. The Contraves System
was not operated on the earlier tests. Table IX shows the mode number
of instruments that make up the Askania System for each flight test. Data
from the first trajectory section were selected from the third flight test
so as to approximate more closely the other tests. The standard deviation
estimates for the Askania system are smaller for the X and Y
component data for the latter two flight tests.

Precision  estimates  for  data  from  the  earlier  flight  tests  for  radar 
systems  112  and  122  are  shown  in  Table  X.  These  standard  deviation 
estimates indicate that the best performance for the radar systems was
during  the  fourth  flight  test.  The  FPS-16  radars  were  operated  in  the  beacon 
tracking  mode  with  a  radar  beacon  aboard  the  tracked  missile. 

SUMMARY AND CONCLUSIONS. The standard deviation estimates for
the position data ranged from two to nineteen feet for the cinetheodolite
systems and ranged from five to twenty-two feet for the FPS-16 radar
systems. This indicates that the radar system position data precision is
as good as the cinetheodolite system position data precision for these flight
test data. The velocity standard deviation estimates ranged from two to
eleven feet per second for the cinetheodolite systems (excepting Contraves
section 2 data) and ranged from three to sixteen feet per second for the
FPS-16 radar systems. Again, a precision equivalence for velocity data
from these systems can be stated. The acceleration standard deviation
estimates for all four tracking systems ranged from eight to forty feet per
second squared (with the exception of Contraves section 2 data). Again an
equivalence can be stated for precision of the acceleration data from these
systems.

The position component average biases were based on the trajectory data
from the Askania cinetheodolite system. The average bias for position
data from the Contraves cinetheodolite ranged in absolute (component)
value from six to seventeen feet (except for section 2 data). The average
bias for position data from radar 122 ranged in absolute (component) value
from eight to thirty-eight feet and, from radar 112, the average bias ranged
in absolute value from a low of 23 to a high of 73 feet. Based on the Askania
cinetheodolite position data, the radar systems did not do as well as the
Contraves system with respect to bias error estimates. The average
component bias for the derived velocity data ranged in absolute value from
zero to four feet per second for the Contraves system and ranged in
absolute value from zero to three feet per second for the FPS-16 radar
systems. Essentially the average velocity bias errors are equal.

The acceleration component bias ranged in absolute value from zero
to six feet per second squared for the Contraves system and from zero to
seven feet per second squared for the FPS-16 radar systems. These
derived acceleration data for eleven point (two second) smoothed data are
essentially equal in average component bias error.
TABLE I

PRECISION ESTIMATES FOR TRAJECTORY POSITION DATA
FROM FLIGHT TEST NUMBER FOUR

Instrumentation   Trajectory   Component Standard Deviation Estimate in Feet
System            Section      North (X)    East (Y)    Up (Z)

Askania               1            5            8          10
Askania               2            7            3          17
Contraves             1           10            2          19
Contraves             2           45*           2          67*
Radar 112             1           12            8          16
Radar 112             2           12            5           7
Radar 122             1            9            5          22
Radar 122             2            9            8          22

*Very poor geometry for a two instrument (theodolite) system.
TABLE II

STANDARD DEVIATION ESTIMATES
FOR DERIVED (SMOOTHED) TRAJECTORY DATA
FROM ASKANIA CINETHEODOLITE SYSTEM
FOR FLIGHT TEST NUMBER FOUR

Trajectory   Derived Trajectory                 Component Estimates of Standard Deviation
Section      Element*             Dimensions    North (X)    East (Y)    Up (Z)

    1        position             feet               5           8          6
    2        position             feet               5           1         13
    1        velocity             ft/sec.            5           4          5
    2        velocity             ft/sec.            6           3         11
    1        acceleration         ft/sec.²          11           8         25
    2        acceleration         ft/sec.²          15           8         40

*All data were derived from mid-point evaluation of a second degree least
squares polynomial fitted over a two second interval (11 points) with time
as the independent variable.
TABLE III

STANDARD DEVIATION ESTIMATES
FOR DERIVED (SMOOTHED) TRAJECTORY DATA
FROM CONTRAVES CINETHEODOLITE SYSTEM
FOR FLIGHT TEST NUMBER FOUR

Trajectory   Derived Trajectory                 Component Estimates of Standard Deviation
Section      Element*             Dimensions    North (X)    East (Y)    Up (Z)

    1        position             feet               5           2         10
    2**      position             feet              19           1         34
    1        velocity             ft/sec.            5           2          4
    2**      velocity             ft/sec.           25           4         43
    1        acceleration         ft/sec.²          16           3         38
    2**      acceleration         ft/sec.²          87          12        148

*All data were derived from mid-point evaluation of a second degree least
squares polynomial fitted over a two second interval (11 points) with time
as the independent variable.

**Poor geometry for a two cinetheodolite instrumentation system.
TABLE IV

STANDARD DEVIATION ESTIMATES
FOR DERIVED (SMOOTHED) TRAJECTORY DATA
FROM RADAR (112) SYSTEM
FOR FLIGHT TEST NUMBER FOUR

Trajectory   Derived Trajectory                 Component Estimates of Standard Deviation
Section      Element*             Dimensions    North (X)    East (Y)    Up (Z)

    1        position             feet              10           7         13
    2        position             feet              12           4          7
    1        velocity             ft/sec.           10          10         10
    2        velocity             ft/sec.            6           4          6
    1        acceleration         ft/sec.²          32          26         40
    2        acceleration         ft/sec.²          30           6         20

*All data were derived from mid-point evaluation of a second degree least
squares polynomial fitted over a two second interval (11 points) with time as
the independent variable. The standard deviation estimates are based on
a sample of fifty (50) trajectory points for each trajectory section.

TABLE V

STANDARD DEVIATION ESTIMATES
FOR DERIVED (SMOOTHED) TRAJECTORY DATA
FROM RADAR (122) SYSTEM
FOR FLIGHT TEST NUMBER FOUR

Trajectory   Derived Trajectory                 Component Estimates of Standard Deviation
Section      Element*             Dimensions    North (X)    East (Y)    Up (Z)

    1        position             feet               7           4         10
    2        position             feet               6           2          9
    1        velocity             ft/sec.            6           4         16
    2        velocity             ft/sec.            4           3          9
    1        acceleration         ft/sec.²          10          16         44
    2        acceleration         ft/sec.²          10          12         30

*All data were derived from mid-point evaluation of a second degree least
squares polynomial fitted over a two second interval (11 points) with time
as the independent variable.
TABLE VI

BIAS ESTIMATES FOR DERIVED (ELEVEN POINT SMOOTHING) DATA
FROM CONTRAVES SYSTEM FOR FLIGHT TEST NUMBER FOUR

Trajectory   Derived Trajectory   Bias          Component Estimates of Average Bias**
Section      Element*             Dimensions    North (X)    East (Y)    Up (Z)

    1        position             feet              -6           9        -17
    2        position             feet             -28          13        -82
    1        velocity             ft/sec.           -2           1         -4
    2        velocity             ft/sec.            0           0         -2
    1        acceleration         ft/sec.²           0          -1          1
    2        acceleration         ft/sec.²          -4           0         -6

*See note in Table II.

**The trajectory data at simultaneous times from the Askania System (chosen
standard) were subtracted from corresponding data from the Contraves System
to form an error set of data which were averaged for each trajectory section.
TABLE VII

BIAS ESTIMATES FOR DERIVED (ELEVEN POINT SMOOTHING) DATA
FROM RADAR 112 SYSTEM FOR FLIGHT TEST NUMBER FOUR

Trajectory   Derived Trajectory   Bias          Component Estimates of Average Bias**
Section      Element*             Dimensions    North (X)    East (Y)    Up (Z)

    1        position             feet             -55         -23        -52
    2        position             feet             -73         -27        -41
    1        velocity             ft/sec.           -2           1          1
    2        velocity             ft/sec.           -2          -1          0
    1        acceleration         ft/sec.²          -1           0          2
    2        acceleration         ft/sec.²          -1          -1         -3

*See note in Table II.

**The trajectory data at simultaneous times from the Askania System (chosen
standard) were subtracted from corresponding data from Radar 112 System
to form an error set of data which were averaged for each trajectory section.


TABLE VIII

BIAS ESTIMATES FOR DERIVED (ELEVEN POINT SMOOTHING) DATA
FROM RADAR 122 SYSTEM FOR FLIGHT TEST NUMBER FOUR


Trajectory

Section

Derived

Trajectory

Element*

Bias

Dimensions

Component Estimates of
Average Bias**

North (X)  East (Y)  Up (Z)

1 

position 

fset 

•  38 

•  11 

31 

2 

position 

feet 

-21 

•  8 

26 

I 

velocity 

ft/ssc, 

0 

1 

0 

2 

velocity 

ft/aac. 

3 

0 

-  2 

1 

accale  ration 

ft/sec. | 

a 

I 

1 

,  7 

2 

acceleration 

ft/aac. 

0 

0 

•  2 

*Sae  MU  in  Table  0. 

**Tha  trijietery  data  it  etmuitaaoous  ttmii  from  tit*  Aikinii  System  (oHomb 
standard)  were  subtracted  from  corresponding  dote  from  Radar  122  Syetem 
to  form  an  orror  aat  of  data  which  wars  averaged  for  oaoh  trajectory  aaetion. 
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TABLE  IX 

COMPARISON OF ASKANIA CINETHEODOLITE SYSTEM
BY FLIGHT TEST PERFORMANCE


Flight    Mode                  Component Standard Deviation
Test      Number of                  Estimate in Feet
Number    Cinetheodolites      North (X)  East (Y)  Up (Z)

  1            6                   11        11         8
  2            7                   10        15        12
  3*           7                    7         4        10
  4**          5                    6         6        14

*Trajectory section one and mode number of instruments corresponding
more closely to earlier tests. Average set for the three trajectory
sections is 8, 8 and 12 respectively for Flight Test three.

**The first three flight tests were night tests with a point source of
light for optical system tracking; whereas, the fourth flight test was
conducted during daylight hours.


TABLE X

COMPARISON OF RADAR SYSTEMS
BY FLIGHT TEST PERFORMANCE


Flight                     Component Standard Deviation
Test      Radar                 Estimate in Feet
Number    Designation     North (X)  East (Y)  Up (Z)

  1       R-112               18        46        34
  2       R-112               25        68        92
  3       R-112*              19        39        16
  4       R-112**             12         7        12
  1       R-122               29        29        21
  2       R-122               21        18        20
  3       R-122               26        34        31
  4       R-122*               9         8        22

*Variate difference estimates for trajectory section 1; data sampled at
2 per second.

**These radars were operated in the beacon tracking mode; whereas, prior
tests have utilized the skin tracking mode.
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THERMAL  CYCLES  IN  WELDING 


Mark  M.  D'Andrea,  Jr. 

U. S. Army Materials Research Agency
Watertown,  Massachusetts 


INTRODUCTION: The mechanical property integrity of weld heat-affected
zones is an inherent and vital consideration in arc welding applications.
A weld heat-affected zone, hereinafter termed "weld-HAZ", is defined as
that volume of base material in a weldment that has been heated, as a
result of welding, to a range of peak temperatures between the preheat
temperature and the material's melting point.

Previous work conducted at the U. S. Army Materials Research Agency,
concerning the welding of fully heat-treated high-strength steels for
service in the as-welded condition, demonstrated that weld-HAZ areas
characterized by peak temperatures at or about the lower critical
temperature suffered a marked loss of strength, thus reducing weld-joint
efficiencies considerably. Other studies with high-strength and maraging
steels have revealed deleterious mechanical property effects resulting
from thermal cycles characterized by peak temperatures above the upper
critical temperature. In addition, it is well known that an embrittling
effect in alloy steels is generally associated with weld-HAZ structures
characterized by peak temperatures between the lower and upper critical
temperatures.

Recent work conducted at AMRA established the general parameters
necessary to define and reproduce the transformational behavior of
weld-HAZ microstructures. These parameters included (but are not
necessarily limited to) the following: (1) the time-temperature shape of
a weld-HAZ thermal cycle, (2) the peak temperature of a thermal cycle,
(3) the microstructure of the base material (defined by heat treatment,
chemistry, working, etc.) prior to the imposition of a thermal cycle, and
(4) factors affecting restraining stresses and strains produced in the
base material as a result of the overall welding operation.

The gamut of microstructures produced in a weld-HAZ is the end result
of the complex and varied transformations caused by welding thermal
cycles. An important consideration, which has been a pertinent reference
point in the present investigation, is the fact that in any arc weld in a
given material there will always be thermal cycles that have the same peak
temperature; these thermal cycles will differ only in that the shape and
position of associated heating and cooling curves will be displaced
somewhat in time and temperature. It is a well established metallurgical
fact that the mechanical properties of a material depend primarily upon
microstructure. In order to predict and perhaps control weld-HAZ
microstructures resulting from welding thermal cycles, it is necessary
first to investigate the effects of basic parameters on such structures.

OBJECT AND SCOPE:


Welding  Metallurgy 

The overall objective is to investigate and to establish basic
metallurgical concepts to account for the phenomena of the attendant
transformation behavior of weld-HAZ microstructures produced in 4340,
H-11, and 18 1/2% Ni (300) maraging steels. The work is limited to a study
of the effects of fundamental material and welding time-temperature
parameters pertaining to single pass, arc welding situations. Realizing
the potentially staggering number of general and sub-parameters that may
significantly affect resultant microstructure, it was deemed advisable to
initiate the investigation by studying only the effects of some of the
general parameters, viz., the prior base material microstructure, the
peak temperature of a thermal cycle, and the time-temperature shape of
thermal cycles imposed by welding. The number and kind of stress-strain
conditions that are applicable to welding were initially considered to be
overwhelming; consequently the utilization of this general parameter in
this initial investigation was abandoned in the sense that such conditions
were kept constant.

Statistical  Inference 

The overall objective of the utilization of statistical inference
techniques is to assist the metallurgical investigation by determining
the significant factors (i.e., the more critical variables) affecting
these phenomena, and to detect the specific significant differences that
may exist among each set of significant factors. The transformational
behavior and the resultant heat-affected zone microstructures produced
will be evaluated metallurgically in terms of such specific significant
differences obtained.

THE EXPERIMENT. A high-speed time-temperature controller
is being used in this investigation to produce weld-HAZ synthetically.
The controller essentially is a simulating device which permits the
duplication of welding thermal cycles experienced in weld heat-affected
zones. Each specimen is heated by its resistance to the passage of an A-C
electric current furnished from a transformer, and is cooled by the
removal of heat from the specimen by conduction through water-cooled
copper clamps. A thermocouple, percussion welded to the surface of the
specimen, provides a signal which is balanced against a reference control
signal designed to reproduce the desired thermal cycle. The resultant
error signal is amplified and utilized by the controller to maintain
temperature control during the cycle to within ±5°F.

The basic experiment involves two of the general parameters as
variables, viz., the prior base material microstructure (defined by
various heat treated conditions of a given single heat of steel) and the
time-temperature shape of various welding thermal cycles. The thermal
cycle peak temperature parameter is a constant in each basic experiment,
i.e., each basic experiment is conducted utilizing thermal cycles having
the same peak temperature.

In each basic experiment, it is desired to determine the effects of
prior base material microstructure (denoted factor code "H") and the
time-temperature shape of thermal cycles (denoted factor code "C") on
the notch toughness (quantitative response variable, measured in
in.-lb/in.², indicative of microstructural change) of the resultant
weld-HAZ microstructures.

In a given heat of steel, the interest lies in the effects of five
particular prior base material microstructures and seven particular
thermal cycles, i.e., factor "H" is a fixed factor at five fixed levels
and factor "C" is a fixed factor at seven fixed levels.

There are three steels (one heat of each) involved in the investigation
along with seven different peak temperatures per heat; therefore,
there are three times seven or twenty-one basic experiments to be
evaluated independently. Metallurgical considerations preclude
statistical correlations between steel types and between peak
temperatures per heat of steel.
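
The counting in the preceding paragraphs (twenty-one basic experiments, each a five-by-seven layout of factor "H" against factor "C") can be checked with a short enumeration. The level labels `T1`..`T7`, `H1`..`H5` and `C1`..`C7` are placeholders assumed for illustration; the text does not name the individual levels.

```python
from itertools import product

steels = ["4340", "H-11", "maraging"]            # three steels, one heat of each
peak_temps = [f"T{k}" for k in range(1, 8)]      # seven peak temperatures (placeholder labels)
h_levels = [f"H{k}" for k in range(1, 6)]        # five prior microstructures (placeholder labels)
c_levels = [f"C{k}" for k in range(1, 8)]        # seven thermal cycle shapes (placeholder labels)

# One basic experiment per (steel, peak temperature) pair.
basic_experiments = list(product(steels, peak_temps))
# Each basic experiment is an H-by-C grid of cells.
cells_per_experiment = list(product(h_levels, c_levels))

print(len(basic_experiments))     # 21
print(len(cells_per_experiment))  # 35
```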

THE DESIGN AND ANALYSIS: The number of observations (notch
toughness values) to be taken is initially unknown; however, it is desired
to design the statistical analysis to allow for the general situation of
dealing with an uneven number of replications per cell, since some
experimental observations are lost occasionally. A basic model appears to
be a fixed, two-way analysis of variance; the suggested mathematical model
for the sum of squares is:

$$SS_{\text{Total}} = SS_{H} + SS_{C} + SS_{\text{Interaction}} + SS_{\text{Residual}}$$
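
As a rough illustration of the sum-of-squares partition, the sketch below computes each term for a balanced layout (the same number of replicates in every cell) and confirms that the parts add up to the total. It is not the author's analysis: the data are randomly generated, the function names are hypothetical, and with unequal cell counts this simple additive partition no longer holds, so a least-squares formulation would generally be needed instead.

```python
import random

def mean(values):
    return sum(values) / len(values)

def two_way_sums_of_squares(y):
    """y[i][j] holds the replicate observations for H level i and C level j.

    Assumes a balanced layout: every cell has the same number of replicates.
    """
    a, b, r = len(y), len(y[0]), len(y[0][0])
    observations = [v for row in y for cell in row for v in cell]
    grand = mean(observations)
    cell = [[mean(y[i][j]) for j in range(b)] for i in range(a)]
    h = [mean(cell[i]) for i in range(a)]                         # H level means
    c = [mean([cell[i][j] for i in range(a)]) for j in range(b)]  # C level means
    return {
        "total": sum((v - grand) ** 2 for v in observations),
        "H": b * r * sum((m - grand) ** 2 for m in h),
        "C": a * r * sum((m - grand) ** 2 for m in c),
        "interaction": r * sum((cell[i][j] - h[i] - c[j] + grand) ** 2
                               for i in range(a) for j in range(b)),
        "residual": sum((v - cell[i][j]) ** 2
                        for i in range(a) for j in range(b) for v in y[i][j]),
    }

# Made-up notch toughness values: 5 levels of H, 7 levels of C, 3 replicates.
rng = random.Random(1)
y = [[[rng.gauss(100.0, 10.0) for _ in range(3)] for _ in range(7)] for _ in range(5)]
ss = two_way_sums_of_squares(y)
parts = ss["H"] + ss["C"] + ss["interaction"] + ss["residual"]
```

Here `parts` agrees with `ss["total"]` up to floating-point rounding.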


Once the individual ANOVA's are run for each basic experiment, one of
the following techniques could be used to detect specific significant
differences that may exist among each set of significant factors obtained.

(1) Use Duncan's Test of the means if, and only if, the cells have
the same number of replications. The means used here are those of
the columns, or rows as the case may be, of the cells pertaining to the
significant factor; if both factors are significant, two such tests are
made regardless of interaction effects. Perhaps this is not a proper
technique, in that only the individual cell averages should be tested by
Duncan's method.

(2) Use the following relationship to test the means of each cell if
there are minor variations in the number of observations per cell.

[Equation illegible in the source.]

(3) Use the following relationship to test the means of each cell if
there are major variations in the number of observations per cell.

[Equation illegible in the source.]

The  foregoing  is  the  author's  suggested  method  of  analysis.  It  is 
important  to  note  that  the  author  is  merely  a  novice  at  this  business  of 
statistical  analysis. 

It has been suggested since the presentation of this paper that the
use of regression analysis techniques may be a better approach to solving
this statistical problem. Unfortunately, circumstances to date have not
yet permitted a further investigation into the most efficient statistical
procedures to be used in this problem.


STATISTICAL ANALYSIS OF TENSILE STRENGTH-HARDNESS
RELATIONSHIPS IN THERMOMECHANICALLY
TREATED STEELS*

Albert  A.  Anctil 

U. S. Army Materials Research Agency
Watertown,  Massachusetts  02172 


INTRODUCTION. Generally speaking, statistical analysis finds limited
applications in metallurgical problems. This is true because the sample
size is usually quite small and, in most cases, the variables are known
and can be controlled. The clinical (statistical) problem described here
is a segment of an investigation entitled "Tensile Strength-Hardness
Relationships in Thermomechanically Treated Steels" [1]. The objective of
the study was to determine metallurgically and statistically how well
thermomechanically treated steels followed established tensile
strength-hardness correlations.

The generally accepted tensile strength-hardness correlations are
published by the American Society for Testing and Materials (ASTM) [2]
and the Society of Automotive Engineers (SAE) [3]. These correlations
specifically excluded cold worked steels, stainless steels and other
thermomechanically treated steels. The ASTM and SAE correlations have
been obtained from a particular steel quenched and tempered to various
strength levels. Tensile specimens which contain hardness coupons were
machined from each strength level condition. These specimens were
distributed randomly to several laboratories participating in a
standardized testing program. The assembled data were treated
statistically to obtain a tensile strength-hardness correlation.

Thermomechanical treatments, which are under consideration here,
involve the introduction of cold work into the heat treatment cycle of
steel to obtain higher strengths. There are three types of
thermomechanical treatments, based upon when in the heat treatment cycle
the working cycle is performed [4]:

Type  I  -  Deformation  of  austenite  followed  by  transformation 

Type  II  -  Deformation  of  austenite  during  transformation 

Type  III  -  Deformation  after  transformation  of  austenite 


*Comments on this paper by one of the panelists can be found at the end
of this article.

EXPERIMENTAL PROCEDURE. The experimental tensile strength-hardness
data came from a literature survey of thermomechanically treated
steels. Refer to Reference 1 for a more detailed explanation and data
references for this presentation.

Figure 1 shows the ASTM (solid curve) and SAE (dashed curve) tensile
strength-hardness correlations. There is some difference of opinion as
to which is the better curve. A joint ASTM-SAE committee is presently
working out a compromise curve. The ASTM curve has been extended
beyond Rockwell C hardness 58 to encompass the very high strength steels.
The data points are from Reference 1 and represent various steels having
a quenched and tempered heat treatment. Such data could have been used
to obtain these correlations. These same steels were then processed
thermomechanically with Type I (open symbols) and Type III (closed
symbols) treatments. Statistically the quenched and tempered data fit the
ASTM correlation better than the SAE correlation. Accordingly, the ASTM
correlation will be used for comparative purposes.

Tensile strength-hardness data for the Type I thermomechanical
treatment are shown in Figure 2. The thermomechanical heat treatment
cycle is shown schematically. The data follow the ASTM correlation
(solid curve) reasonably well. Figure 3 illustrates Type II data. This
thermomechanical treatment is usually performed on austenitic stainless
steels at subzero temperatures. Meaningful comparisons of these data are
difficult with such a small sample size. Type III data are shown in
Figure 4. The cold work may be performed upon the as-quenched martensite
or upon tempered martensite that is subsequently aged. A positive
deviation from the ASTM correlation is immediately apparent over the major
portion of the hardness range for Type III data.

Selected data for Type III treatments where the percent reduction has
been varied are shown in Figure 5. Consider the 5Cr-Mo steel where
the lowest tensile strength plotted represents the quenched and tempered
condition. Note that as the amount of cold work is increased, the
tensile strength increases at a faster rate than that shown by the ASTM
correlation. This same trend can be seen for the majority of these
steels. It is for this reason that a regression line was not drawn for
these data. A tensile strength-hardness correlation for these steels
would be dependent upon the amount of cold work.


DISCUSSION. Metallurgically the behavior was explained using
Tabor's analysis [5], which relates hardness and tensile strength through
an additional parameter, the strain hardening exponent n. The analysis
is summarized in Figure 6. Quenched and tempered steels have strain
hardening exponents in the range from 0.04 to 0.12. In this range the
tensile strength-hardness ratio is nearly constant. It is for this reason
that a unique tensile strength-hardness correlation exists. For Type I
treatments the strain hardening exponents fall in the same range;
therefore, the data fit the ASTM correlation. With Type III treatments the
ratio starts at the minimum and increases as the exponent decreases to
nearly zero with increasing amounts of cold work. This results in
positive deviations from the ASTM correlation. Type II treatments are
usually performed on austenitic stainless steels at subzero temperatures.
These steels have very high exponents (0.3) in the annealed condition
which decrease to nearly zero with increasing amounts of deformation. One
would expect positive deviations from the high and low exponents and
adherence to the correlation as the ratio passes through its minimum
value. Cold-worked steel (Type III) and stainless steels (Type II) have
been excluded from the ASTM correlation because of these drastic changes
in strain hardening characteristics.

Statistical analysis of the data is summarized in Table I. The
deviation $\Delta\sigma$ refers to the experimental tensile strength
$\sigma$ minus the corresponding tensile strength from the ASTM
correlation at a particular hardness. This deviation was determined for
every data point. The arithmetic mean of the deviations
$\overline{\Delta\sigma}$ was taken as the sum of the deviations divided
by the sample size. It serves as an indication of how well the data for
thermomechanically treated steels fit the ASTM correlation. This value
would be zero for a regression line of the data.

The absolute deviation $\overline{|\Delta\sigma|}$ and the standard error
of estimate $S_{y \cdot x}$ were calculated as measures of the dispersion
of the data about the ASTM curve. These differ from the usually defined
mean absolute deviation and standard error of estimate, which measure the
dispersion around a regression line.
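
The three measures described above can be sketched as follows for deviations taken about a fixed reference curve rather than a fitted regression line. Several things here are assumptions made only for illustration: the straight-line `toy_curve` stands in for a standard correlation (it is not the ASTM curve), the data are invented, and the standard error is computed with divisor n, which may differ from the divisor used in the paper.

```python
import math

def deviation_summary(hardness, strength, reference_curve):
    """Mean deviation, mean absolute deviation, and root-mean-square
    deviation of measured strengths about a given reference curve."""
    deviations = [s - reference_curve(h) for h, s in zip(hardness, strength)]
    n = len(deviations)
    mean_dev = sum(deviations) / n
    mean_abs_dev = sum(abs(d) for d in deviations) / n
    std_err = math.sqrt(sum(d * d for d in deviations) / n)  # divisor n is an assumption
    return mean_dev, mean_abs_dev, std_err

def toy_curve(h):
    # Hypothetical stand-in for a standard strength-hardness correlation.
    return 5.0 * h

hardness = [40.0, 50.0, 60.0]     # invented hardness readings
strength = [203.0, 246.0, 301.0]  # invented tensile strengths

mean_dev, mean_abs_dev, std_err = deviation_summary(hardness, strength, toy_curve)
```

In this invented example the deviations are 3, -4 and 1, so the mean deviation is zero even though no point lies on the curve; the two dispersion measures are what reveal the scatter.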

Statistical results are shown in Table II. The mean of the deviations,
$\overline{\Delta\sigma}$, shows a better fit of the quenched and tempered
data about the ASTM rather than the SAE correlation. Further, the data for
the Type I treatment fit the ASTM correlation better than the Type III
treatments. Also, the predominantly positive deviation of the Type III
data from the ASTM correlation is indicated. The absolute deviation and
the standard error of estimate about the ASTM curve yield approximately
the same results. They do not, however, reflect the positive deviation of
data for the Type III treatments.

The problem before the panel is that of offering more descriptive
statistical alternatives for comparing several populations of data (tensile
strength-hardness values for thermomechanically treated steels) to a
given regression line (the standard ASTM tensile strength-hardness
correlation). Consider further that it may not be possible or meaningful
to draw a regression line through each population of data.

Table II. STATISTICAL RESULTS FOR QUENCHED AND
TEMPERED AND THERMOMECHANICALLY TREATED STEELS

The evaluation of empirical relations of the kind you discussed is a
difficult problem. The various functions of deviations from the ASTM
curve that are presented in your Table II are extremely difficult to
interpret: By themselves, they are nearly meaningless. Taken together
with the data, as exhibited in your figures, they add very little and may
be misleading.

For example, looking at Figure 1, I notice that the steels used in
Type I and Type III thermomechanical treatments respectively seem to
be grouped preponderantly in different hardness ranges. Is it possible
that the ASTM curve fits better for Type I and the SAE curve for Type
III? If this were so, an explanation would have to be sought in the
metallurgical facts about the data used, and in the history of the two
standard curves.

Table II gives overall measures of goodness-of-fit. Since these are
well-defined functions of the data, they cannot be "wrong" in themselves.
But if the deviations from the standard curves occur for different reasons
in different types of steels and in different hardness ranges, the overall
measures cannot be relied upon to describe the uncertainty of tensile
strength estimates derived (using the curves) from hardness measurements.
Furthermore, if the overall measures are used to select the
"best-fitted" curve, there is great danger that the resulting curve will
have systematic errors arising from the particular choice of data.

Of course, for many purposes a standard curve is entirely adequate.
But your data seem to make it clear that one possible long-run goal would
be the development of a collection of curves each applicable to specific
circumstances. This development would probably require the performance
of many new experiments. It could lead to the evolution of your
qualitative explanation of the behavior of thermomechanically treated
steels into a quantitative explanation.

The statistical measures quite properly play a very small role in
your valuable summary of the published evidence on tensile-strength/
hardness relationships. I am sure that in future studies you will continue
to be guided by the totality of scientific information available to you, and
I hope you will often find that statistical techniques are helpful in data
analysis.


SOME  PROBLEMS  IN  STATISTICAL  INFERENCE  FOR 


I'TMr'n  A  I^T-7C-ni 


\a r i  i  TiKjnvn  »  i 


riOr,,,L  A.’r,OMc: 


Bernard Harris

INTRODUCTION. Assume that a random sample of size N has been
drawn from a "multinomial population" with an unknown and possibly
countably infinite number of classes. That is, if $X_i$ is the ith
observation and $M_j$ the jth class, then

$$P\{X_i \in M_j\} = p_j \ge 0, \quad j = 1, 2, \ldots;\ i = 1, 2, \ldots, N,$$

and $\sum_{j=1}^{\infty} p_j = 1$. The classes are not assumed to have a
natural ordering. Let $n_r$ be the number of classes occurring exactly r
times in the sample. Then, we clearly have

$$\sum_{r=1}^{N} r\, n_r = N.$$
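
The occupancy numbers $n_r$ are easy to tabulate from a sample. The sketch below, with a made-up sample and a hypothetical helper name `occupancy_numbers`, counts how many classes occur exactly r times and checks the identity above.

```python
from collections import Counter

def occupancy_numbers(sample):
    """Return a mapping r -> n_r, the number of classes seen exactly r times."""
    class_counts = Counter(sample)         # observations per class
    return Counter(class_counts.values())  # classes per observation count

# Made-up sample of N = 7 observations over unordered class labels.
sample = ["a", "b", "a", "c", "d", "a", "b"]
n = occupancy_numbers(sample)

# "c" and "d" occur once, "b" twice, "a" three times.
print(n[1], n[2], n[3])                        # 2 1 1
print(sum(r * n_r for r, n_r in n.items()))    # 7
```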


We will be concerned with estimating the following two quantities,
which are generally of interest to experimenters.

(1) The sample coverage, defined by

(1)   $C = \sum p_i$,

where the sum runs over all classes which have occurred at least once
in the sample.

(2) The population entropy, defined by

(2)   $H = -\sum_{i=1}^{\infty} p_i \log p_i$.

It will be convenient in our definition of entropy to violate the usual
conventions and use natural logarithms rather than logarithms to base 2.
This is equivalent to a scale change in units of measurement and will have
no essential effect on any uses for which the entropy might be employed. Of
course, we will assume throughout that the series (2) converges, since
otherwise the discussion will not be relevant.

In those problems which present difficulty, namely where too many
of the $p_i$'s are too small, small sample inference appears to be
virtually hopeless; hence, all results described herein will be asymptotic
results, i.e., for large N.

Estimation of H and C. For the moment, we will restrict attention to the
case of an ordinary multinomial population, that is, one with a finite
number, k, of classes. Then the "natural estimator" of entropy H is
defined by

(3)   $\hat{H} = -\sum_{i=1}^{k} \hat{p}_i \log \hat{p}_i$,

where $\hat{p}_i$, the relative frequency of class i in the sample, is the
maximum likelihood estimator of $p_i$.

Its properties have been discussed by G. P. Basharin [1] and we note
them briefly. Basharin showed that

(4)   $E(\hat{H}) = H - \frac{k-1}{2N} + O(N^{-2})$

and

(5)   $\operatorname{Var}(\hat{H}) = \frac{1}{N}\left[\sum_{i=1}^{k} p_i \log^2 p_i - H^2\right] + O(N^{-2})$,

and $\sqrt{N}(\hat{H} - H)$ is asymptotically normally distributed. If we
attempt to apply Basharin's results to the more general case described
earlier, it is easily seen that the naive replacement of $p_i$ by
$\hat{p}_i$ in (2) may not
be successful. Essentially, Basharin's technique depends on the following
sort of asymptotic behaviour:

as $N \to \infty$, $N p_i \to \infty$, i = 1, 2, ..., k.

Consequently, if we have zero as a limit point of the $p_i$'s, or even if
we have the limiting behaviour associated with the Poisson approximation,

as $N \to \infty$, $p_i \to 0$, $N p_i \to \lambda_i$, $0 < \lambda_i < \infty$,

for a sufficiently large number of classes, then Basharin's estimator,
$\hat{H}$, may be quite poor. The following illustration will exhibit
this. Let $p_i = 1/N^2$, i = 1, 2, ..., $N^2$. Then $H = 2 \log N$.
However, since the maximum of $\hat{H}$ occurs for $\hat{p}_i = 1/k$ when
there are k classes, and at most N classes can be observed in a sample of
size N, we have that $\hat{H} \le \log N$. Hence, it is quite clear that
if there are "too many classes whose probabilities are too small",
$\hat{H}$ will not be a satisfactory estimator. One of the causes of the
difficulty is that $\hat{H}$ gives no weight to unobserved cells, so that
if the total probability in unobserved cells is large, $\hat{H}$ will
not perform too well.
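
The illustration can be tried numerically. The following sketch draws a sample of size N from a population with $N^2$ equally likely classes and computes the natural (plug-in) estimator; the function name `plugin_entropy`, the choice N = 200, and the random seed are assumptions made only for this example.

```python
import math
import random
from collections import Counter

def plugin_entropy(sample):
    """Natural estimator: entropy of the observed relative frequencies."""
    n = len(sample)
    return -sum((c / n) * math.log(c / n) for c in Counter(sample).values())

N = 200
rng = random.Random(0)
# N observations from N^2 equally likely classes, so H = 2 log N.
sample = [rng.randrange(N * N) for _ in range(N)]

h_true = 2 * math.log(N)
h_hat = plugin_entropy(sample)
print(h_true, math.log(N), h_hat)
```

Whatever the sample, the estimate cannot exceed log N, which is half the true entropy in this setting.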

We can gain some insight in dealing with this if we examine the
second question we advanced, the estimation of the sample coverage.
This problem is discussed in greater detail in B. Harris [3], but it is
convenient at this time to make some intuitive observations concerning the
estimation of C, so that we can resolve the difficulties noted above in the
estimation of H.

First, note that if we were to proceed as Basharin did and set

$$\hat{C} = \sum \hat{p}_i,$$

with the sum running over the classes observed in the sample, then we have
that $\hat{C} = 1$ for all samples, which is clearly inappropriate.

We can guide our intuition by first examining some extreme cases.

(1) If $n_1 = N$, then we readily reach the conclusion that C must be
small. We can see this as follows. If we now take another observation,
inasmuch as every past observation resulted in a new class being observed,
it is apparent that with probability quite close to unity, the (N+1)th
observation will also result in a new class. In fact, the probability that
the (N+1)th observation will not result in a new class is C, which of
course should be near 0, as noted.

(2) If, on the other hand, there is an integer t, substantially larger
than one, such that $n_1 = n_2 = \cdots = n_{t-1} = 0$, $n_t > 0$, then
similar reasoning would lead us to conclude that most of the probability is
concentrated in classes with high probability, and therefore C should
be near unity.

(3) Let $p_i = 1/N$, i = 1, 2, ..., N. Then $E(n_1) \sim N e^{-1}$ and
$E(n_0) \sim N e^{-1}$. Thus, we should have $C \sim 1 - e^{-1}$.

In short, as is shown in B. Harris [3], it is the low order occupancy
numbers, such as $n_1$, $n_2$, and $n_3$, which contain the principal
information concerning the probability content of unobserved classes. A
cursory examination of the three examples cited above suggests that an
appropriate estimator for C is given by

$$\hat{C} = 1 - \frac{n_1}{N}.$$

In Harris [3], it is shown that $\hat{C}$ is in fact a suitable estimator,
in that it has good asymptotic properties.
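
The first two extreme cases above can be reproduced directly from the estimator $1 - n_1/N$. This is a minimal sketch; the function name `coverage_estimate` and the toy samples are assumptions for illustration.

```python
from collections import Counter

def coverage_estimate(sample):
    """Estimate sample coverage as 1 - n_1 / N, where n_1 is the number
    of classes observed exactly once."""
    counts = Counter(sample)
    n1 = sum(1 for c in counts.values() if c == 1)
    return 1.0 - n1 / len(sample)

# Case (1): every observation is a new class, so n_1 = N.
all_new = list(range(10))
# Case (2): no class occurs fewer than five times.
all_repeated = [0, 1] * 5

print(coverage_estimate(all_new))       # 0.0
print(coverage_estimate(all_repeated))  # 1.0
```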

In E. B. Cobb and B. Harris [2], a method for estimating entropy
when "all the sample information is contained in the low order occupancy
numbers" was exhibited. In order to do this, we will show that we can
represent entropy asymptotically by

(7)   $H = \frac{E(n_1)}{N} \int_0^{\infty} e^{x} \log\left(\frac{N}{x}\right) dF^*(x)$,

where

(8)   $F^*(x) = \dfrac{\sum_{N p_j \le x} N p_j e^{-N p_j}}{\sum_{j=1}^{\infty} N p_j e^{-N p_j}}$.

It is easily verified that $F^*(x)$ is a cumulative distribution function.
Since

(9)   $E(n_1) \sim \sum_{j=1}^{\infty} N p_j e^{-N p_j}$,

substitution of (8) and (9) into (7) produces

$$\frac{1}{N} \sum_{j=1}^{\infty} e^{N p_j} \log\left(\frac{1}{p_j}\right) N p_j e^{-N p_j} = -\sum_{j=1}^{\infty} p_j \log p_j = H,$$

which verifies (7).
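
The Poisson-type approximation used for $E(n_1)$ can be compared with the exact value. For multinomial sampling, class j occurs exactly once with probability $N p_j (1 - p_j)^{N-1}$, so the exact expectation is the sum of these terms. The sketch below uses the equiprobable case $p_j = 1/N$ with N = 1000; both the helper names and the choice of N are assumptions for illustration.

```python
import math

def expected_n1_exact(p, N):
    """Exact E(n_1): sum over classes of P(class occurs exactly once)."""
    return sum(N * pj * (1 - pj) ** (N - 1) for pj in p)

def expected_n1_poisson(p, N):
    """Approximation of E(n_1) by the sum of N p_j exp(-N p_j)."""
    return sum(N * pj * math.exp(-N * pj) for pj in p)

N = 1000
p = [1.0 / N] * N   # N equally likely classes

exact = expected_n1_exact(p, N)
approx = expected_n1_poisson(p, N)
print(exact, approx, N * math.exp(-1))
```

For this case the approximation reduces to $N e^{-1}$, and the two values differ by well under one percent.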

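Under the assumptions stated above, Cobb and Harris [2] suggested
that the entropy be estimated by

(10)   $\tilde{H} = \dfrac{n_1}{N} \cdot \dfrac{(N - m_1)^2}{(N - m_1)^2 + (m_2 - m_1^2)} \cdot e^{x_0} \log\left[\dfrac{N}{x_0}\right], \qquad x_0 = m_1 - \dfrac{m_2 - m_1^2}{N - m_1}$,

where $m_1 = 2 n_2 / n_1$ and $m_2 = \max(m_1^2,\ 6 n_3 / n_1)$.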

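At this point it is worthwhile to present a numerical example, which
will illustrate the behavior of $\hat{H}$.

Example. Let $p_i = 1/N$, $i = 1, 2, \ldots, N$. Then $n_1 \sim N e^{-1}$,
$n_2 \sim \frac{N}{2} e^{-1}$, and $n_3 \sim \frac{N}{6} e^{-1}$. Thus, $m_1, m_2 \sim 1$ and

$F^*(x) = 0$ for $x < 1$,  $F^*(x) = 1$ for $x \ge 1$.

Then

$\hat{H} \sim \cdots \dfrac{(N-1)^2}{(N-1)^2} \log N$   [leading factors illegible in the source]

and $\hat{H} = \log N$, which is as it should be.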
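
The moment quantities in this example can be checked numerically. The
sketch below is illustrative only: it assumes the Poisson approximation
$E(n_r) \approx N e^{-1}/r!$ for $N$ equally likely classes and a sample of size
$N$, and it assumes $m_1 = 2n_2/n_1$ and $m_2 = \max(m_1^2, 6n_3/n_1)$. The
function names are hypothetical.

```python
import math


def expected_occupancy(n_classes, sample_size, r):
    """Poisson approximation to E(n_r) for equally likely classes."""
    lam = sample_size / n_classes            # expected count per class, N * p_i
    return n_classes * math.exp(-lam) * lam ** r / math.factorial(r)


def low_order_moments(n1, n2, n3):
    """Return (m1, m2) with m1 = 2 n2 / n1 and m2 = max(m1^2, 6 n3 / n1)."""
    m1 = 2.0 * n2 / n1
    m2 = max(m1 * m1, 6.0 * n3 / n1)
    return m1, m2


if __name__ == "__main__":
    N = 1000                                  # hypothetical number of classes
    n1, n2, n3 = (expected_occupancy(N, N, r) for r in (1, 2, 3))
    print(low_order_moments(n1, n2, n3))      # both moments are close to one
```

With both moments equal to one, the distribution $F^*$ places all of its mass
at $x = 1$, in agreement with the step function given in the example.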

Clearly, it is principally the classes with small probabilities that
contribute to $n_0$, $n_1$, $n_2$, and $n_3$. For those classes with large
probabilities, we can estimate $p_j$ by $\hat{p}_j$.

Then, the natural way of proceeding is to estimate the contribution
to entropy from large classes by means of Basharin's method and the
contribution of small classes by $\hat{H}$, and we denote the final estimator by
$H^*$. Recall that in order to use $\hat{H}$, we have taken $n_1$, $n_2$, and $n_3$ to
determine $\hat{H}$.

There is one last detail which must be taken into account. Part of
the contribution to moderate order occupancy numbers, such as $n_4$, $n_5$,
and some of the succeeding occupancy numbers, will be due to classes
with small probabilities and the effect of sample fluctuations. Therefore,
we need to examine the following. What proportion of each
$n_j$, $j = 4, 5, \ldots, s$, $s$ some sufficiently large integer, is due to a large
deviation from a class with small probability? We can adopt a theorem
due to A. Wald [4], obtaining the following inequalities.

(11)  if $m_2 > m_1^2$,  $E(n_{k+1}) \ge \dfrac{6^{k-1}\, n_3^{k-1}}{2^{k-2}\,(k+1)!\; n_2^{k-2}}$,  $k = 3, 4, \ldots$;

(12)  if $m_2 = m_1^2$,  $E(n_{k+1}) \ge \dfrac{2^{k}\, n_2^{k}}{(k+1)!\; n_1^{k-1}}$,  $k = 3, 4, \ldots$.

The right hand side of each inequality gives the expected values of
$n_{k+1}$ if "the sample information is contained in $n_1$, $n_2$, and $n_3$". Thus
the difference between the left and right hand sides of (11) and (12) gives
an estimate of the contribution to $n_{k+1}$ which is due to classes with larger
probabilities. We apply Basharin's estimator (3) to these, upon replacing
the expected values in the left hand sides of (11) and (12) by the observed
values.


Thus, we finally write

(13)  $H^* = \lambda \hat{H} + (1-\lambda) \tilde{H}$,

where $0 \le \lambda \le 1$ is the proportion of the observations in $n_1$, $n_2$, and $n_3$,
and the parts of $n_4, n_5, \ldots, n_s$ determined by (11) and (12), and $\tilde{H}$
denotes Basharin's estimator. For the parts of the sample allocated to small
classes as noted above we use $\hat{H}$, and use $\tilde{H}$ on the part allocated to
large classes.

The mathematical details will be given in a later publication.

APPLICATION OF NUMERICAL TECHNIQUES

TO 

EXPERIMENTALLY  MODEL  AN  AERODYNAMIC  FUNCTION* 

Andrew  H.  Jenkins 

Physical Sciences Laboratory, Directorate of Research and Development
U.  S.  Army  Missile  Command,  Redstone  Arsenal,  Alabama 


ABSTRACT. This report describes the use of an aeroballistic range
in  the  design  and  execution  of  an  aerodynamic  experiment,  the  analysis 
of  the  experimental  data  by  numerical  techniques  to  develop  a  model  of  a 
physical  function,  and  the  statistical  testing  of  the  data  and  the  model. 

The  report  discusses  the  approach,  the  experimental  design,  and  the 
testing  of  the  data  using  several  frequency  distributions.  It  presents  and 
describes  a  multivariate  nonlinear  regression  analysis  performed  on  the 
data,  the  physical  model  developed  by  the  regression  analysis,  and  the 
testing  of  the  model.  It  also  lists  and  presents  the  tests  of  hypotheses 
made  and  discusses  the  results  of  the  tests. 


SYMBOLS 

Acoustic  velocity  in  air 
Pure  constant  of  regression  equation 
First  coefficient  of  regression  equation 
Counts  per  inch  of  photoreader  =  3502 
Second  coefficient  of  regression  equation 
Coefficient  of  specific  heat  at  constant  pressure 
Coefficient  of  specific  heat  at  constant  volume 
Statistical degrees of freedom
Frequency  distribution 

Magnification factor of shadowgraph = 1.009
Magnification factor of Schlieren = 0.855
Ratio of shock density $\rho_s$ to free stream density $\rho_\infty$
Natural  logarithm  (base  e) 


*This article was initially issued as U.S. Army Missile Command Report
No. RR-TR-65-11.


SYMBOLS (continued)

$M$          Mach number = V/a
$M_1$        Mach factor level = 1.1 to 1.5
$M_2$        Mach factor level = 2.5 to 2.9
$M_3$        Mach factor level = 3.9 to 4.3
$M_i$        Mach factor effect in statistical equation
$M_L$        Mach factor linear effect
$M_Q$        Mach factor quadratic effect
$MR_{ij}$    Main factor interaction effect
$N$          Total observations
$P$          Statistical probability
$r$          Regression correlation coefficient
$R_0$        Universal gas constant = 1715 sq. ft./sq. sec./°R
$R$          Radius
$R_1$        Model nose/base radius ratio = 1.0
$R_2$        Model nose/base radius ratio = 0.7
$R_3$        Model nose/base radius ratio = 0.4
$R_b$        Model base radius = 0.112 inch
$R_j$        Radius factor effect in statistical equation
$R_n$        Nose radius of model
$R_r$        Model nose to base radius ratio
$R_L$        Radius factor linear effect
$R_Q$        Radius factor quadratic effect
[symbol illegible]   Surface roughness of model
$S^2$        Experimental sample variance
$S$          Experimental sample standard deviation
$SS$         Sum of squares
$t$          Value of Student's frequency distribution
$T$          Absolute temperature (°Rankine)
$V$          Flight model velocity
$\bar{X}$    Mean
$\bar{X}_{aw}$   Mean of Ambrosio-Wortman model
[symbol illegible]   Mean of experimental responses
$X_i$        ith response
$\bar{X}_r$  Mean of regression model responses
$X_1$        Dependent variable of regression equation (computer language)
$X_{2,3}$    Independent variable of regression equation (computer language)
$Y$          Normal frequency distribution
$\alpha$     Type I error risk level
$\beta$      Type II error risk level
$\gamma$     Ratio of specific heats = $c_p/c_v$
$\delta_{sh}$    Shock detachment distance from shadowgraph optical system
$\delta_{sc}$    Shock detachment distance from Schlieren optical system
$\Delta$     Shock detachment distance in photoreader counts (corrected)
$\epsilon_{k(ij)}$   Experimental error
[symbol illegible]   Variance of experimental responses
[symbol illegible]   Variance of regression model
[symbol illegible]   Variance of Ambrosio-Wortman model
$\mu$        Universal mean
[symbol illegible]   Frequency distribution

I. INTRODUCTION. A number of new aerodynamic problems have
come  into  prominence  in  recent  years.  The  source  of  the  problems  has 
been  the  very  high  flight  velocities  achieved  by  use  of  rockets.  The 
characteristics  of  the  problems  of  the  very  high  flight  velocities,  referred 
to  as  supersonic  or  hypersonic  flight,  are  those  of  a  hydrodynamic  nature. 
The  Mach  numbers  are  high  and  problems  of  a  physical  and  chemical 
nature  also  exist  because  the  energy  of  the  flow  is  large.  The  gases  are 
rarefied  so  that  the  mean  free  path  is  not  negligibly  small  compared  with 
an appropriate macroscopic scale of the flow field. Under such conditions,
kinetic theory is included with the hydrodynamics.

The new features of a hydrodynamic nature allow the use of certain
simplifying assumptions in developing theories for hypersonic flow. On
the other hand, certain important features which appear introduce
additional complications over those met within gas dynamics at more
moderate speeds. The techniques of linearization of the flow equations and
the use of mean-surface approximation for boundary conditions have a
diminishing range of applicability. Also, entropy gradients produced by
curved shock waves make the classical isentropic irrotational approach
inapplicable.

The additional problems of a physical and/or chemical nature are
associated with the high temperatures of the flow as the gases traverse
the strong bow shock wave. The sudden shock heating of the gases
excites the vibrational degrees of freedom of the molecules, resulting in
dissociation of the species into atoms, electrons, and ions which do not
require treatment at lower velocities. Therefore, it must be recognized
that physical phenomena rather than hydrodynamic phenomena may not
only influence the flow but in many cases control it. In view of the
complexities of the flow at high Mach numbers and the number of technical
disciplines involved, many have resorted to experimental or empirical
development of functional relationships.

The flow field originates at the bow shock. The shock wave
characteristics are very important to the stagnation region characteristics.
The volume of the stagnation region is dependent on the shock detachment
distance. Therefore, much of the knowledge of the flow characteristics
is dependent on the knowledge of the shock location. Experiments have
been performed in wind tunnels to study the shock location. However,
few experiments have been made to study this problem under free flight
conditions. Also, the studies which have been made and the derived
relationships are lacking, as tests have not been attempted to determine
their reliability.

It  is  apparent  that  the  community  recognizes  the  need  for  improved 
hypersonic  design  theory.  One  of  the  important  areas  is  the  prediction 
of shock detachment distance. It is important to the computation of not
only heat transfer but also pressure distributions and drag on the
forepart of the vehicle. This has been pointed out by Serbin [1], Ambrosio
and Wortman [2], DiDonato and Zondek [3], Heberle, Wood, and
Gooderum [4], and Love [5].

The lack of purely theoretical models for the prediction of shock
detachment distance at transonic and supersonic velocities has led to the
natural consequence of an experimental approach. This is to be expected
and, in addition, the theoretical hypothesis is inevitably subject to
experimental verification. For this reason, one can also expect to contribute
to scientific progress by the inverted approach of formulating models of
the mathematical relationships between physical variables by
experimentation. However, the relationships derived are subject to experimental
control, measurement accuracy, human error, and many other sources
of unexplained or unaccounted for deviations from the true universal
relationships.

In the direct approach (i.e., the a priori derivation of a mathematical
model), quite often ideal physical conditions are assumed and simplifying
mathematical assumptions are made which depart from the real case.
Therefore, one cannot be sure of the theory nor can one be certain of the
experimental data. Yet, in scientific endeavor, exacting conclusions are
often drawn by the comparison of an idealized hypothesis and real case
data. That is, both quantities are coupled to each other and not to an
independent estimate of the deviation present.

Empirical models of the shock detachment distance for blunt bodies
of revolution have been made by Serbin [1], Ambrosio and Wortman [2],
and Heberle, Wood, and Gooderum [4]. The data were obtained by these
authors using moving streams of air surrounding stationary spheres
(i.e., radius nosed bodies) in such experimental devices as wind tunnels
and jet nozzles. Both of these devices have two common disadvantages.
The gaseous medium is in a state of expansion just prior to the shock
compression. Also, holding devices are present in the flow around the
body which cause perturbations in the flow. The flow is often not uniform
in cross section. The measurements, therefore, include these perturbations
and do not represent the real case of a vehicle in free flight.

Serbin [1] derived the following relationship for a sphere:

(1)  $\dfrac{\Delta}{R} = \dfrac{2}{3}\,(K - 1)^{-1}$,

Ambrosio and Wortman [2] derived the following relationship:

(2)  $\dfrac{\Delta}{R} = 0.143\, e^{3.24/M^2}$,

and Heberle, Wood, and Gooderum [4] derived this relationship:

(3)  $\dfrac{\Delta}{R} = \dfrac{4}{3}\,(M - 1)^{-1/3}$.

Each author stated that agreement between the model and the data
was very satisfactory. However, the standard by which this was
determined was not stated or explained. This type of unexplained, seemingly
arbitrary, acceptance of a model and data appeared to be typical.
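
To give a feel for how two of these empirical models compare, the sketch
below evaluates the Serbin form $\frac{2}{3}(K-1)^{-1}$ and the Ambrosio-Wortman
form $0.143\,e^{3.24/M^2}$ at a few Mach numbers. It is an illustration only. In
particular, it assumes that the density ratio $K$ may be taken from the
perfect-gas normal-shock relation with a specific heat ratio of 1.4, which
is an assumption of this sketch and is not stated by the cited authors. The
function names are hypothetical.

```python
import math


def density_ratio(mach, gamma=1.4):
    """Normal-shock density ratio K for a perfect gas (assumed for this sketch)."""
    return (gamma + 1.0) * mach ** 2 / ((gamma - 1.0) * mach ** 2 + 2.0)


def serbin(mach, gamma=1.4):
    """Standoff ratio of the Serbin form: (2/3) / (K - 1)."""
    return (2.0 / 3.0) / (density_ratio(mach, gamma) - 1.0)


def ambrosio_wortman(mach):
    """Standoff ratio of the Ambrosio-Wortman form: 0.143 * exp(3.24 / M^2)."""
    return 0.143 * math.exp(3.24 / mach ** 2)


if __name__ == "__main__":
    # Mach numbers chosen near the middle of each range used in the experiment.
    for mach in (1.3, 2.7, 4.1):
        print(f"M = {mach:.1f}: Serbin {serbin(mach):.3f}, "
              f"Ambrosio-Wortman {ambrosio_wortman(mach):.3f}")
```

Both forms decrease as the Mach number increases, but neither carries any
statement of how well it fits the data from which it was derived.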

A  machine  literature  search  was  made.  In  this  search,  over 
100,000  documents  were  screened  and  matched  by  computer  on  the  basis 
of  key  words  and  terms  in  aerodynamics  and  statistics.  This  was  done 
to  determine  if,  in  the  past,  any  use  of  statistics  in  testing  aerodynamic 
experimental data had been done. Not one document was found during
the search. However, this is not to imply that statistics have not been
used.  Apparently,  it  is  either  not  a  prevalent  or  accepted  practice  or 
possibly  has  not  been  reported. 

Ambrosio  and  Wortman  [6]  did  attempt  the  use  of  some  simple 
statistical  methods.  This  was  done  to  the  extent  of  computing  the  mean, 
the absolute mean, and the standard deviation. However, it was not for
the purpose of testing the reliability of their data and model but to
objectively establish the relative worth of their model as compared to
Serbin [1].

This work has two objectives as follows:

1) To develop an empirical model of shock detachment distance as
a function of Mach number and vehicle nose radius with experimental
data obtained under free flight conditions.

2) To subject this model and data to analysis by statistical methods
to objectively define the level of confidence of such a model.

II. EXPERIMENTAL PROCEDURES.

1. Design

The shock detachment distance can be described aerodynamically
for radius nosed bodies of revolution as:

(4)  $\Delta = f(M, R)$.

Explicit models of several investigators were mentioned in the
introduction.


Statistically, the model can be expressed as:

(5)  $\Delta = \mu + M_i + R_j + MR_{ij} + \epsilon_{k(ij)}$.

The model contains two independent factors, Mach number ($M_i$) and body
radius ($R_j$). It also contains a second order effect, the $MR_{ij}$ interaction.

The design of the experiment required consideration of both the
aerodynamic and the statistical aspects. Past experience indicated that the
shock detachment distance was a nonlinear function of Mach number (M)
and a linear function of radius (R). The objectives of the experiment are
to determine if the linear and quadratic effects of Mach number and
radius contribute significantly to the shock detachment distance. Also,
it was desired to determine if a second order or interaction effect
between radius and Mach number contributes significantly to the shock
location. The analysis of variance is a useful tool for this. In addition,
it was also desired to develop an empirical model of the functional
relationships between the independent and dependent variables. A
regression analysis was planned for this.

The analysis of data by regression calculation can be simplified by
the equal spacing of the independent variables which permits the use of
orthogonal polynomials. This helps also in the subsequent adjustment
arising from the discarding of insignificant variables or the addition of
new terms. One objective of the experiment is to estimate the slope of
the regression. The slope of a regression is estimated more precisely
if the values of the independent variables are selected with equal spacing
at the extremes of the quantified ranges of the variable. This is because
interpolation is more reliable than extrapolation and the computations
are simplified.
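
For a factor with three equally spaced levels, the orthogonal polynomial
coefficients are (-1, 0, 1) for the linear effect and (1, -2, 1) for the
quadratic effect. The sketch below, which is illustrative and uses a
hypothetical function name, shows how each contrast yields a sum of squares
with one degree of freedom and how the two together account for the whole
between-level sum of squares.

```python
LINEAR = (-1, 0, 1)       # orthogonal polynomial coefficients, three levels
QUADRATIC = (1, -2, 1)


def contrast_sum_of_squares(level_totals, coefficients, n_per_level):
    """Single-degree-of-freedom sum of squares for one contrast.

    level_totals: total response at each factor level
    n_per_level:  number of observations contributing to each total
    """
    contrast = sum(c * t for c, t in zip(coefficients, level_totals))
    return contrast ** 2 / (n_per_level * sum(c * c for c in coefficients))


if __name__ == "__main__":
    totals = [5.56, 2.27, 1.59]   # invented level totals, 9 observations each
    print("linear:", contrast_sum_of_squares(totals, LINEAR, 9))
    print("quadratic:", contrast_sum_of_squares(totals, QUADRATIC, 9))
```

Because the two coefficient sets are orthogonal, dropping one term from the
model leaves the estimate of the other unchanged.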

The effects of the main factors in this experiment could not be
considered theoretically independent. Therefore, it is necessary to
replicate the experiment within cells of all factor levels in order to test
for interactions between factors and to estimate the experimental error.
Since one objective is to statistically test for interaction, the analysis
of variance will enable the test of interaction and estimates of error
variance. The two best tests for statistical analysis of the aerodynamic
experiment are the analysis of variance and the multivariate regression
analysis. The experimental design most efficient for these methods is
the factorial experiment with replication.

The factorial experiment enables one to test the effects of Mach
number (M) and radius (R) on the shock location ($\Delta$) over the ranges of
interest of M and R at each factor level. It also promotes testing for
the existence of interaction between M and R and the effect of
interaction on $\Delta$. One is also able to differentiate interaction effects from
main effects. In addition, it allows the determination of confidence
limits for the estimates of main and interaction effects based on the
estimate of experimental error derived from replication.

Therefore, the experiment was designed as a fixed model $3^2$
factorial. Both the radius and Mach number factors are equispaced, three
level, fixed, and quantitative. The Mach number range of interest was
1.0 to 4.5. The levels selected were $M_1$ = 1.1 to 1.5, $M_2$ = 2.5 to 2.9,
and $M_3$ = 3.9 to 4.3. The radii selected were nose to base radius ratios
of $R_1$ = 1.0, $R_2$ = 0.7, and $R_3$ = 0.4. The experiment was replicated
three times in each factor cell; therefore, a total of 27 observations
was recorded (N = 3 x 3 x 3 = 27).


All 27 responses could not be obtained in 1 day. Therefore, to
compensate for day-to-day variations in personnel, voltages, developing
solutions, film batches, and printing, the firing sequence was
randomized. All combinations of factors and replicates were listed and
the experimental sequence was randomized by use of a random number
generator [7] which was entered in a random manner. The results of the
randomization are shown in Table I. The numbers shown without
parentheses are the sequence of firing while the numbers in parentheses
are the corresponding round identification numbers. Table I also shows
the factor levels selected for the experiment.
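
The listing and randomization step can be illustrated with a short sketch.
Python's standard pseudo-random generator is used here only as a stand-in
for the random number generator of [7], and the seed value and function
name are hypothetical; the sequence produced is therefore not the sequence
of Table I.

```python
import itertools
import random


def randomized_sequence(n_radius=3, n_rep=3, n_mach=3, seed=1965):
    """Map firing order (1-based) to (radius level, replicate, Mach level)."""
    # List every combination of factor levels and replicates once.
    runs = list(itertools.product(range(1, n_radius + 1),
                                  range(1, n_rep + 1),
                                  range(1, n_mach + 1)))
    rng = random.Random(seed)     # seeded so the sequence can be reproduced
    rng.shuffle(runs)
    return {order: run for order, run in enumerate(runs, start=1)}


if __name__ == "__main__":
    for order, (radius, rep, mach) in randomized_sequence().items():
        print(f"firing {order:2d}: R_{radius}, replicate {rep}, M_{mach}")
```

Every one of the 27 cell-replicate combinations appears exactly once, so
the randomization changes only the order in which the rounds are fired.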


Table I. Randomized Experimental Sequence

                                      Mach Number Levels
Nose/Base                   M_1            M_2            M_3
Radius Ratio   Replicate    1.1 to 1.5     2.5 to 2.9     3.9 to 4.3
------------   ---------    ----------     ----------     ----------
R_1 = 1.0      1            26 (75)         7 (56)        11 (60)
               2            22 (71)         8 (57)         6 (54)
               3             2 (49)        14 (63)        10 (59)

R_2 = 0.7      1            12 (61)        13 (62)         9 (58)
               2            23 (72)        27 (76)        25 (44)
               3            24 (73)        18 (67)        15 (64)

R_3 = 0.4      1            16 (65)         3 (50)        19 (68)
               2             1 (48)        17 (66)         5 (53)
               3             4 (52)        20 (69)        21 (70)

Notes:

1. Numbers without parentheses are the randomly determined
program firing sequence.

2. Numbers with parentheses are for experiment identification.

The radii of the models are discrete levels. The Mach number
levels are discrete intervals as it is almost impossible to duplicate
exact velocities by this method of experiment. This is due to variations
in propellants, model material homogeneity, and model-launch tube
interference. The Mach number levels chosen were fixed in selected
ranges between Mach 1.0 and 4.5, which is the velocity regime of interest
in this aerodynamic study. As a two factor fixed model experiment, it
is assumed that $\mu$ is a fixed constant and the $\epsilon_{k(ij)}$'s are normally
and independently distributed with a zero mean.

2.  Procedure 

The experimental data were obtained on the Physical Sciences
Laboratory's free flight aeroballistic range. Figure 1 shows the
experimental apparatus. It consists of a light gas gun for launching the
models, an altitude simulation chamber, and a shadowgraph and a Schlieren
system for photographing the model and the flow around the model.
Submicrosecond electronic counters to determine the model's
time of flight are also included.

The aerodynamic data required from this experiment are the radius
of the model, the Mach number of the model, and the detachment
distance of the shock. The radius of each model was known as the models
were formed in accurately machined dies. Their geometries are shown
in Figure 2. The models were made of copper coated lead. The Mach
number is determined by taking the ratio of the model velocity to the
acoustic velocity when the photographs are made. The acoustic velocity
is computed as shown in Appendix A. It is seen that the acoustic
velocity varies as the square root of the temperature and specific heat
ratio. The temperature was recorded at the time of launching each
model. The specific heat ratio was taken as 1.4. The model velocity
was computed by taking the ratio of the distance between the
shadowgraph and Schlieren stations to the time recorded on the counter. The
distance between the shadowgraph and Schlieren stations is a constant
of 5 feet. It was assumed that the deceleration of the model over 5 feet
was linear; therefore, the velocity computed was the velocity of the
model midpoint between the two stations.
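
The Mach number computation described above can be sketched as follows.
The sketch assumes the perfect-gas relation $a = \sqrt{\gamma R_0 T}$ for the
acoustic velocity, which is consistent with the stated square-root
dependence but is an assumption here, since Appendix A is not reproduced
in this section. The function names and the sample inputs are hypothetical.

```python
import math

GAS_CONSTANT = 1715.0       # sq. ft./sq. sec./deg R, from the symbol list
GAMMA = 1.4                 # specific heat ratio used in the text
STATION_SPACING_FT = 5.0    # shadowgraph-to-Schlieren distance


def acoustic_velocity(temp_rankine, gamma=GAMMA, gas_constant=GAS_CONSTANT):
    """Acoustic velocity in ft/sec, assuming a = sqrt(gamma * R0 * T)."""
    return math.sqrt(gamma * gas_constant * temp_rankine)


def mach_number(flight_time_sec, temp_rankine):
    """Mach number from the counter time over the 5-foot station spacing."""
    velocity = STATION_SPACING_FT / flight_time_sec   # midpoint velocity, ft/sec
    return velocity / acoustic_velocity(temp_rankine)


if __name__ == "__main__":
    # Hypothetical reading: 2.0 milliseconds between stations at 530 deg R.
    print(round(mach_number(2.0e-3, 530.0), 3))
```

Because the counter records only the elapsed time between the two
stations, the resulting Mach number applies to the midpoint of the 5-foot
interval under the linear deceleration assumption.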

Figure 2. Sketch of Experimental Flight Models

Photographs of the model showing the shock detachment distance
were taken by both the shadowgraph and Schlieren systems. The
measure of the shock detachment distance from either one of these photos
would not coincide with the velocity of the model. Therefore, with the
assumption of linearity, the shock detachment distance was corrected
to the velocity computation. The correction of the detachment distance
required the consideration of the magnification factors for the
photographs. The magnification factor for the shadowgraph camera was
1.009 and that for the Schlieren camera was 0.855. The photo reader upon
which the negatives were read was calibrated at 3502 electronic counts
per inch in the plane of the negative on the photo reader. The shock
detachment distance was read in counts from both the shadowgraph and
Schlieren negatives. The detachment distance and radius of each type
of model were corrected to counts as follows:

(6)  $\Delta = \delta_{sh} F_{sc} + \delta_{sc} F_{sh}$

and

(7)  $R = 2 \times C \times R_b \times F_{sh} \times F_{sc} \times R_r$.

The values of $\Delta$ and $R$ computed for each round are shown in Table II.
A sketch of a typical shock detachment distance as taken by the
shadowgraph and Schlieren is shown in Figure 3.

The experimental data obtained from the experimental program
are compiled in Table II. The data are tabulated and identified by the
round number assigned on the aeroballistic range. Computations of
certain data presented in Table II are shown in Appendix A. The data
from round number 75 were used to show a typical example of the
computational procedures.
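
The correction to photoreader counts can be sketched as below. The sketch
assumes that the detachment distance in counts is
$\delta_{sh} F_{sc} + \delta_{sc} F_{sh}$ and that the radius in counts is
$2 \times C \times R_b \times F_{sh} \times F_{sc} \times R_r$; the first of these
is read from a damaged equation and should be treated as an assumption. The
function names and the sample readings are hypothetical.

```python
COUNTS_PER_INCH = 3502.0    # C, photoreader calibration
F_SHADOWGRAPH = 1.009       # F_sh, shadowgraph magnification factor
F_SCHLIEREN = 0.855         # F_sc, Schlieren magnification factor
BASE_RADIUS_IN = 0.112      # R_b, model base radius in inches


def detachment_counts(delta_sh, delta_sc):
    """Detachment distance in counts from the two photographic readings.

    Each reading is scaled by the other system's magnification factor so
    that both terms are on a common scale (assumed form).
    """
    return delta_sh * F_SCHLIEREN + delta_sc * F_SHADOWGRAPH


def radius_counts(radius_ratio):
    """Nose radius in counts for a given nose/base radius ratio R_r."""
    return (2.0 * COUNTS_PER_INCH * BASE_RADIUS_IN
            * F_SHADOWGRAPH * F_SCHLIEREN * radius_ratio)


def standoff_ratio(delta_sh, delta_sc, radius_ratio):
    """Dimensionless ratio of detachment distance to nose radius."""
    return detachment_counts(delta_sh, delta_sc) / radius_counts(radius_ratio)


if __name__ == "__main__":
    # Hypothetical readings in counts for a model with R_r = 0.7.
    print(round(standoff_ratio(110.0, 95.0, 0.7), 4))
```

Under these assumptions the factor of 2 in the radius expression balances
the sum of the two readings, so that a detachment distance photographed
without error by both systems gives the same ratio as the physical
distance divided by the physical nose radius.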


Figure 3. Sketch of Typical Shock Detachment


Table II. Compilation of Experimental Data

III. ANALYSES. The data obtained from the experiment are
presented in Table II. The observations taken as the dimensionless ratio
of the standoff distance divided by the model radius are presented in the
factorial design layout in Table III along with some computations in
preparation for performing an analysis of variance. The statistical
computations are presented in Appendix B.

The gathering of the data, the analysis, and derivation of the model
of the functional relationships from the experimental observations are
based on certain aerodynamic and statistical assumptions. These
assumptions are:

1) Small angles of attack of the models (i.e., less than 2°) do not
significantly affect the detachment distance.

2) The models were free from ablation products in the stagnation
region.

3) The effects of gas constituent dissociation on the dynamics of
flow were insignificant.

4) The effects of spin stabilization on the dynamics of flow were
insignificant.

5) The effect of the conical section of two of the models on the
dynamics of the flow was insignificant (i.e., all projectiles were
hemispheres of various radii).

6) The experimental error is normally and independently
distributed.

7) The experimental precision is essentially the same for all
factor combinations.

8) The factors were fixed at discrete levels so, therefore, are
not independent of each other.

Assumptions 1 through 5 are made concerning the aerodynamics of
the experiment. These represent sources of variation which are
considered negligible. They cannot be separated explicitly from the main
and interaction effects. It is important to note that, even though
considered negligible, these variations are present and are statistically
accounted for by summation into experimental error. The statistical
assumptions 6 through 8 allude to these conditions.

1.  Analysis  of  Variance 

The experiment was described in Section II by the statistical
model

(8)  $\Delta = \mu + M_i + R_j + MR_{ij} + \epsilon_{k(ij)}$.

The theoretical model underlying the analysis of variance assumes that
each experimental response of the shock detachment distance ($\Delta$) is the
algebraic sum of:

1) An overall mean of the detachment distance, $\mu$ (i.e., true
standoff distance)

2) A Mach number effect on the standoff distance, $M_i$

3) A radius effect on the standoff distance, $R_j$

4) An interaction effect on the standoff distance, $MR_{ij}$

5) A random residual error (experimental), $\epsilon_{k(ij)}$.

Since the model is a fixed model, none of the effects can be measured
absolutely. They can be measured only as differential deviations, i.e.,
the $M_i$ as deviations from $\mu$, the $R_j$ as deviations from $\mu$, and the $MR_{ij}$
as deviations from $M_i + R_j$.


The results of the analysis of variance are shown in Table IV. The
computations are presented in Appendix B.


From  Table  IV,  it  can  be  seen  that  the  main  effects  of  radius  have 
apparently  no  significant  effect  on  the  shock  detachment  distance  at  the 
95  percent  level  of  confidence.  The  linear  and  quadratic  effects  are 
also  insignificant.  The  quadratic  effects  of  radius  seem  to  have  the 
most  effect  on  the  standoff  distance.  They  would  be  significant  at  the 
80  percent  level  of  confidence  though  still  not  significant  at  the  95  per¬ 
cent  level. 


The Mach number is significant at the 95 percent level of confidence.
The  computed  value  in  the  F  test  is  greater  than  the  F  distribution 
table  value  by  a  factor  of  about  5.  The  linear  and  quadratic  effects  are 
also  significant.  The  linear  effect  of  the  Mach  number  factor  was 
found  to  be  more  significant  than  the  quadratic  effect. 

The analysis of variance also shows that there is apparently no
significant effect of the $MR_{ij}$ interaction on the standoff distance. It is
interesting to note, however, that of all the combinations of linear and
quadratic interactions between Mach number and radius, the quadratic
radius and linear Mach number were most nearly significant at the 95
percent level of confidence. This is congruent with the fact that the
test of the quadratic effects of radius and the linear effects of Mach
number was highest in the main effects tests. Under the interaction
effects tests, the computed value of 3.59 for the $R_Q M_L$ combination
would be significant at the 92 percent level as compared to 4.41 for the
F value at the 95 percent level.

It is also noted in Table IV that the mean squares for radius and
radius-Mach number interactions were only slightly higher than the
mean square for error. On the basis of the assumption that the
experimental error is normally distributed between all factors and all levels,
radius and interaction effects do not significantly contribute to
shock detachment distance within the limits of this experiment.
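
The comparison of each mean square with the mean square for error can be
illustrated by a minimal two-factor analysis of variance for a balanced,
replicated layout. The sketch is not the computation of Appendix B: the
function name is hypothetical and the numbers in DATA are invented, so the
resulting F ratios have no bearing on Table IV.

```python
def two_way_anova(data):
    """Sums of squares, degrees of freedom, and F ratios for a balanced
    two-factor fixed model. data[i][j] holds the replicates for level i of
    factor A (Mach number) and level j of factor B (radius)."""
    a, b, n = len(data), len(data[0]), len(data[0][0])
    observations = [y for row in data for cell in row for y in cell]
    correction = sum(observations) ** 2 / (a * b * n)

    ss_total = sum(y * y for y in observations) - correction
    ss_cells = sum(sum(cell) ** 2 for row in data for cell in row) / n - correction
    ss_a = sum(sum(sum(cell) for cell in row) ** 2 for row in data) / (b * n) - correction
    ss_b = sum(sum(sum(data[i][j]) for i in range(a)) ** 2
               for j in range(b)) / (a * n) - correction

    ss = {"A": ss_a, "B": ss_b, "AB": ss_cells - ss_a - ss_b,
          "error": ss_total - ss_cells}
    df = {"A": a - 1, "B": b - 1, "AB": (a - 1) * (b - 1),
          "error": a * b * (n - 1)}
    ms = {key: ss[key] / df[key] for key in ss}
    f_ratio = {key: ms[key] / ms["error"] for key in ("A", "B", "AB")}
    return ss, df, f_ratio


# Invented responses for a 3 x 3 layout with three replicates per cell.
DATA = [
    [[0.62, 0.60, 0.64], [0.61, 0.63, 0.59], [0.60, 0.65, 0.62]],
    [[0.25, 0.27, 0.24], [0.26, 0.23, 0.25], [0.24, 0.26, 0.27]],
    [[0.18, 0.17, 0.19], [0.17, 0.18, 0.16], [0.19, 0.18, 0.17]],
]

if __name__ == "__main__":
    ss, df, f_ratio = two_way_anova(DATA)
    for source in ("A", "B", "AB"):
        print(source, df[source], round(f_ratio[source], 2))
```

For a 3 x 3 layout with three replicates the error term carries 18 degrees
of freedom, which is the denominator used for every F ratio in a fixed
model of this kind.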

The results of the analysis of variance, as shown in Table IV, are
further analyzed as shown in Figure 4. Figure 4 is the graphic display
of the results of the Duncan range tests as computed in Appendix B.
Figure 4(a), for the Mach number range significance test, shows that
the $M_1$ level (1.1 to 1.5) is significantly different from the $M_2$ and $M_3$
levels of 2.5 to 2.9 and 3.9 to 4.3, respectively. The $M_2$ and $M_3$
levels were not found to be significantly different from each other. The
radius factor range test as shown in Figure 4(b) shows the radius factor
levels not significantly different from each other. The fact that the $M_2$
and $M_3$ levels are not significantly different from each other will be
discussed later in this section.

2.  Regression  Analysis 

The analysis of variance can be performed whether the factors
are quantitative or qualitative. When the factors are quantitative, then
a regression analysis can be performed on the data. This analysis is
especially useful in the determination of the general functional
relationships of the factors at other than the experimentally assigned
levels.

The analysis of variance has led to knowledge of the important factor
considered in this experiment which contributes to the shock detachment
distance. This was found to be the linear and quadratic effects of
Mach number. This led to a bivariate regression analysis. The regression
analysis used was the SNAP Multiple Regression Analysis for the
IBM 7090 computer (the Army Missile Command SHARE 183 program).

As  pointed  out,  it  is  realized  that  the  shock  detachment  distance 
is  not  singularly  a  function  of  Mach  number.  There  are  other  factors 
which  were  not  included  in  this  experiment.  For  the  factors  considered 
by  the  analysis  of  variance,  some  knowledge  of  the  main  significant 
factor  (Mach  number)  is  now  available. 

Before progressing with the regression analysis, the physical
aspects of the shock detachment distance must be considered. The
functional relationship must be consistent with the aerodynamic concepts
of the detachment distance. The detachment distance is inversely
proportional to Mach number. That is:

(9)   \Delta = f(1/M) = f(a/V).

The limits of the functional relationships are then

(10)  \lim_{a \to \infty} f(a/V) = \lim_{M \to 0} f(1/M) = \lim \Delta = \infty

      \lim_{a \to 0} f(a/V) = \lim_{M \to \infty} f(1/M) = \lim \Delta = 0

      \lim_{V \to \infty} f(a/V) = \lim_{M \to \infty} f(1/M) = \lim \Delta = 0

      \lim_{V \to 0} f(a/V) = \lim_{M \to 0} f(1/M) = \lim \Delta = \infty

      \lim_{V \to a} f(a/V) = \lim_{M \to 1} f(1/M) = \lim \Delta = constant.


The  functional  relationship  as  determined  by  the  regression  analysis 
should  be  compatible  with  these  bounds  and  pass  the  limit  tests. 

The computer program is a linear multiple regression analysis.
However, the analysis of variance indicated that the linear and quadratic
effects of Mach number are significant. Therefore, a transformation
was required to make the computer program applicable to the hypothesized
relationship. The relationship is hypothesized as

(11)  \Delta = A M^b (M^2)^c.


A physical limitation of the functional aspect of \Delta is that

(12)  (\Delta + R) / R \geq 1

because as the free stream Mach number goes to infinity, the shock is
no longer detached but attached and the standoff distance is zero.
Therefore, the desired functional form of the equation is

(13)  \Delta / R = A M^b (M^2)^c,

which presents the detachment distance as a dimensionless ratio, which
is a more usable form for design engineering purposes.

This is not to indicate the dependence of detachment distance on
body nose radius but to account for differences in body geometry. That
is, the equations of detachment distance for bodies with radius noses
cannot be used for sharp pointed bodies such as cones or purely blunt
bodies such as right circular cylinders. Therefore, this functional
relationship is for a geometric class of bodies, i.e., radius nosed bodies.

Equation (11) was programmed for the regression analysis by using
the natural logarithm transformation. The equation programmed was

(14)  \ln(\Delta / R) = \ln A + b \ln M + c \ln M^2.

In computer language, the equation was

(15)  \ln Y = \ln A + b \ln X_1 + c \ln X_2.


The values of \Delta/R and M were taken from Table II and programmed
into the computer, where

(16)  X_1 = M
      X_2 = M^2.

The computer transformed the experimental data to the natural
logarithm form.
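Because X_2 = M^2, the transformed regressor ln X_2 is exactly 2 ln X_1, so b and c cannot be estimated separately; only the combination b + 2c is determined by the data. The short Python sketch below illustrates this. The Mach numbers in it are hypothetical values chosen for illustration and are not the test data; the two Run 1 coefficients are the ones reported in Table V.

```python
import math

# Hypothetical Mach numbers, chosen only for illustration (not the test data).
mach = [1.2, 1.4, 2.6, 2.8, 4.0, 4.2]

ln_x1 = [math.log(m) for m in mach]        # ln X1 = ln M
ln_x2 = [math.log(m ** 2) for m in mach]   # ln X2 = ln M^2

# The second regressor is an exact multiple of the first (ratio 2),
# so the two columns are perfectly collinear.
ratios = [x2 / x1 for x1, x2 in zip(ln_x1, ln_x2)]

# Only b + 2c is identifiable. Using the Run 1 values reported in Table V:
b_run1, c_run1 = -27.610352, 12.842773
effective_slope = b_run1 + 2.0 * c_run1

print(ratios)
print(round(effective_slope, 6))
```

The effective Run 1 slope, about -1.925, is close to the single Run 2 coefficient of -1.910723, which is consistent with the program dropping the X_2 term.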


The results of the computer regression analysis are shown in
Table V. The computer made two runs. After the first run, the results
are automatically tested for significance (\alpha = 0.05) and the
insignificant variables are dropped. It can be seen that the X_2 term was
dropped by the computer. The data for run 2 were taken as the final
regression analysis values. The pure constant (A), the first coefficient
(b), and the correlation coefficient (r) were tested and found significant
as shown in Table V and Table VI. The regression equation is therefore:

(17)  \ln Y = \ln A + b \ln X_1

      \ln Y = 0.7512 - 1.911 \ln X_1.

Taking the antilog, the equation becomes

(18)  Y = 2.12 X_1^{-1.911} = 2.12 / X_1^{1.911}

or

(19)  \Delta / R = 2.12 / M^{1.911}

with a standard error of estimate of 0.3903.


3.  Testing  the  Model 


Through the use of the analysis of variance, the effect of Mach
number on the detachment distance was determined to be significant
both linearly and quadratically. Based on this, a regression analysis
was used to derive a general mathematical relationship between
detachment distance and Mach number. Certain physical limits were
prescribed for the form of the equation. These physical limits are tested
as follows:

(20)  if M = 0,  \Delta / R = 2.12 / (0)^{1.911} = 2.12 / 0 = \infty



Table V. Compilation of Regression Analysis Data

Model: \ln Y = \ln A + b \ln X_1 + c \ln X_2

Type of Data                                Symbol            Run 1        Run 2
Pure Constant                               (A)              0.748900     0.751177
First Coefficient                           (b)            -27.610352    -1.910723
Second Coefficient                          (c)             12.842773    (dropped)
Standard Deviation Y from Mean                               1.084638     1.084638
Coefficient of Determination                (r^2)            0.878570     0.875469
Multiple Correlation Coefficient            (r)             -0.937321    -0.935665
Variance                                    \sigma^2_{1.2}   0.154759     0.152363
Standard Error of Estimate                  \sigma_{1.2}     0.393394     0.390337
Standard Deviation of First Coefficient     \sigma_b        31.500086     0.144127
Standard Deviation of Second Coefficient    \sigma_c        15.740889    (dropped)

T Value for Coefficient Check after First Run (\alpha = 0.05): 2.60

Test of Significance of Regression Coefficients A, b

t(\alpha/2 = 0.025, df = 25) = \pm 2.06

Hypothesis A = 0:
    t = 0.7512 / (0.39033 / \sqrt{27}) = 10.002 > 2.06
    Test significant, reject hypothesis.

Hypothesis b = 0:
    t = 13.25 > 2.06
    Test significant, reject hypothesis.

Test of Significance of Simple Correlation Coefficient r

Hypothesis r = 0:
    t = (0.935665 - 0) / 0.152363 = 6.14 > 2.06
    Test significant, reject hypothesis.
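The three significance tests on the Run 2 constants can be checked with a few lines of Python. This is only a sketch built from the Table V values. The form used for the test on b (the coefficient divided by its standard deviation) is an assumption, since the source prints only the result 13.25; the test on r divides by the tabulated variance, as printed in the source.

```python
import math

n = 27                  # number of responses
t_table = 2.06          # two-tailed t, alpha = 0.05, df = 25 (value quoted in the text)

# Run 2 values from Table V
pure_constant = 0.751177
b = -1.910723
std_error_estimate = 0.390337
sd_b = 0.144127
r = -0.935665
variance = 0.152363

t_a = pure_constant / (std_error_estimate / math.sqrt(n))
t_b = abs(b) / sd_b            # assumed form; the source prints only 13.25
t_r = (abs(r) - 0.0) / variance

for name, t in (("A", t_a), ("b", t_b), ("r", t_r)):
    verdict = "reject" if t > t_table else "accept"
    print(f"hypothesis {name} = 0: t = {t:.3f}, {verdict}")
```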


(21)  if M = 1,  \Delta / R = 2.12 / (1)^{1.911} = 2.12 / 1 = 2.12

(22)  if M = \infty,  \Delta / R = 2.12 / (\infty)^{1.911} = 2.12 / \infty = 0.



Therefore, the regression equation has the correct form for the physical
limitations. Since Mach number is dimensionless, the inclusion of
R gives dimension to \Delta. R is not tested for limits of 0 and \infty, as
R = 0 implies a pointed body and R = \infty a flat plate.
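The limit checks (20) through (22) can be reproduced numerically. The sketch below evaluates the fitted equation (19); the function name is illustrative, and the M = 0 and M = infinity cases are handled explicitly so that the limiting values are returned rather than a division error.

```python
import math

def delta_over_r(mach: float) -> float:
    """Regression model (19): Delta/R = 2.12 / M**1.911."""
    if mach == 0.0:
        return math.inf      # limit (20): shock infinitely far from the body
    if math.isinf(mach):
        return 0.0           # limit (22): attached shock
    return 2.12 / mach ** 1.911

for m in (0.0, 1.0, 2.5, 4.0, math.inf):
    print(m, delta_over_r(m))
```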


Table VI. Compilation of Test Hypotheses

Hypothesis                      df      Frequency      \alpha   Type     Test          Hypothesis
                                        Distribution            Test     Significant
R = 0                           2, 18   F              0.05     1 Tail   No            Accept
M = 0                           2, 18   F              0.05     1 Tail   Yes           Reject
MR = 0                          4, 18   F              0.05     1 Tail   No            Accept
\bar{X}_e = \bar{X}_r           26      t              0.05     2 Tail   No            Accept
S_e^2 = \sigma_r^2              26      \chi^2         0.05     2 Tail   Yes           Reject
\bar{X}_r = \bar{X}_aw                  Z              0.05     2 Tail   No            Accept
\sigma_r^2 = \sigma_aw^2        26      \chi^2         0.05     2 Tail   No            Accept
A = 0                           25      t              0.05     2 Tail   Yes           Reject
b = 0                           25      t              0.05     2 Tail   Yes           Reject
r = 0                           25      t              0.05     2 Tail   Yes           Reject

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Next,  the  regression  model  was  statistically  tested  against  the 
experimental data and the Ambrosio-Wortman model mentioned in
Section II. These computations are shown in Appendix B. The means and
variances for the experimental data, the regression model, and the
Ambrosio-Wortman model were computed based on responses computed
for the experimental Mach numbers. Table VI shows a compilation of
the hypotheses for testing the regression model means and variances.
Table VII shows the computed 95 percent confidence limits of the means
for the experiment, the regression model, and the Ambrosio-Wortman
model. The hypothesis that there is no difference between the variance
as experimentally determined and as determined by the regression
model is the only hypothesis rejected. The hypotheses that there is no
significant difference between the experimental mean and the regression
model mean, or between the regression model mean and the
Ambrosio-Wortman model mean, are accepted. The hypothesis of no
significant difference between the regression model variance and the
Ambrosio-Wortman model variance is also accepted.


Table VII. Compilation of 95 Percent Confidence
Limits on Means

Type Mean            Mean \Delta/R    Increment    Limits
Experiment           0.793            ± 0.451      1.244 to 0.342
Regression Model     0.726            ± 0.249      0.975 to 0.477
Ambrosio-Wortman     0.687            ± 0.293      0.981 to 0.395

The computations for the 95 percent confidence limits for the
experimental responses, the regression model, and the Ambrosio-Wortman
model are shown in Table VII. The regression model has the narrowest
range of values for this level of confidence. However, the \chi^2 test of
the difference between the variances (the second statistical moment) is
not significant, nor is the difference in their means (the first statistical
moment). Therefore, even though the limits of the regression model are
narrower than those of the Ambrosio-Wortman model, they are not
significantly different.

of the regression model and the experimental responses is indicative of
the insight into the functional relationship between detachment distance
and Mach number obtained by the analysis of variance performed prior
to the regression analysis. The fit of the equation by the method of
least squares is approaching the true mean, as evidenced by the high and
significant correlation coefficient (r) of 0.94 (Table V).

In order to determine the power of the tests between the means of
the two models (regression model and Ambrosio-Wortman model), an
operating characteristics curve was computed. The calculations are
in Appendix B and the plotted values are shown in Figure 5. From this
plot, the probabilities of an acceptance of the hypothesis when it is
actually false (type II error) can be determined for selected differences
in the means of the two models. For example, the probability of
acceptance when the difference between \bar{X}_r and \bar{X}_aw is ±0.30
is about 65 percent, and the probability of rejecting the hypothesis is
35 percent.
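An operating characteristic value of this kind can be approximated directly from the normal distribution. The sketch below assumes a two-tailed Z test with known variances, a critical value of 1.96, and the standard error of the difference of 0.1968 computed in Appendix B; the function names are illustrative. Values computed this way are close to, but not identical with, the figures quoted in the text (about 0.67 rather than 0.65 for a difference of 0.30).

```python
import math

def phi(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def accept_probability(diff: float, sigma_diff: float = 0.1968,
                       z_crit: float = 1.96) -> float:
    """Probability of accepting equal means when the true difference is diff."""
    k = abs(diff) / sigma_diff          # difference in standard-error units
    return phi(z_crit - k) - phi(-z_crit - k)

for diff in (0.0, 0.15, 0.30, 0.50, 0.80):
    print(f"difference {diff:.2f}: accept with probability "
          f"{accept_probability(diff):.2f}")
```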

Plots of the values of \Delta/R computed for Mach numbers from 1 to 8
for the regression model and the Ambrosio-Wortman model are shown
in Figure 6. The loci of the points for the regression model and the
Ambrosio-Wortman model are shown for comparison. There is a
region of high curvature or nonlinearity between Mach 1.5 and about
Mach 2.5, with the curves becoming asymptotic beyond 2.5. The
Ambrosio-Wortman model becomes asymptotic to a \Delta/R value of 0.143,
whereas the regression model has a zero asymptote, the ultimate physical
limit. As mentioned earlier in this section, the Duncan range test
indicated that the M1 level was significantly different from the M2 and
M3 levels. Figure 6 shows the curve becoming essentially asymptotic
at about Mach 2.5 or at about the beginning of the M2 factor level.
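The comparison plotted in Figure 6 can be tabulated with a short Python sketch. The regression model is equation (19). The Ambrosio-Wortman form used here, 0.143 exp(3.24/M^2), is the form as read from Appendix B and should be treated as an assumption about the garbled source; the function names are illustrative.

```python
import math

def regression_model(mach: float) -> float:
    """Equation (19): Delta/R = 2.12 / M**1.911."""
    return 2.12 / mach ** 1.911

def ambrosio_wortman(mach: float) -> float:
    """Assumed form read from Appendix B: Delta/R = 0.143 * exp(3.24 / M**2)."""
    return 0.143 * math.exp(3.24 / mach ** 2)

print("Mach  regression  Ambrosio-Wortman")
for mach in range(1, 9):
    print(f"{mach:4d}  {regression_model(mach):10.3f}  "
          f"{ambrosio_wortman(mach):16.3f}")
```

At high Mach number the first model tends to zero while the second levels off at 0.143, which is the difference in asymptotes described above.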


[Figure 6. \Delta/R versus Mach number; legend: □ regression model.]
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IV.  SUMMARY .  This  experimental  and  analytical  exercise  has  led 
to  the  development  of  a  mathematical  model  of  shock  detachment  distance. 
This  model  has  been  statistically  tested  for  significance  on  the  basis  of 
comparison  with  several  universal  frequency  distributions.  The  hypo¬ 
theses  made  and  tested  are  compiled  in  Table  VI. 

The hypothesis that the radius has no effect on the detachment
distance was accepted. This does not mean that radius has no effect on the
shock detachment distance but that, within the limits of the tests, a
significant effect cannot be detected. That is, one cannot reject the
hypothesis.

The hypothesis that the Mach number has no effect on the detachment
distance was rejected. Mach number is apparently a significant
contributor to shock location. This means that within the limits of the
test a significant variance associated with Mach number is detectable and
cannot be attributed to experimental error.

The hypothesis that the MR interaction has no effect on detachment
distance was also accepted. This hypothesis is accepted for reasons
similar to those for the hypothesis on radius effects. From Table IV, the
ANOVA table, it can be seen that the radius effect accounts for only
1.65 percent of the total expected mean square of the experiment. Mach
number accounts for 59.25 percent, MR interaction accounts for 3.30
percent, and error accounts for 35.80 percent. It is pointed out that the
variance attributable to variables not included in the experiment could
be summed in the Mach number factor, which if separated would reduce
the detectable effects of Mach number. For example, body surface
roughness, free stream density, and humidity, possible sources not
included in the experiment, may significantly affect shock location.

The hypotheses on the derived regression constant, coefficient, and
correlation coefficient were all rejected. This implies that these values
were significantly different from the values one would derive from data
where there was no correlation between the variables included in the
analysis. The standard error of estimate of 0.390337 shows that the
fit for the universe line of regression is good but not perfect. For a
perfect fit, the standard error of estimate would be zero and the
correlation coefficient 1.0 instead of 0.935665. This emphasizes the fact
that all variables which affect the shock location are not included and all
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variances  present  have  not  been  accounted  for.  However,  the  model  does 
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that  is  "explained"  by  the  independent  variable  (M). 

The mean of the experimental data was not found to be significantly
different from the mean of the regression model, whereas the variances
were significantly different. However, since the variance test is a
more sensitive test (i.e., the second statistical moment as compared
to the first statistical moment), it is believed that this also attests
to the reliability of the model. The mean of the regression model was
not found to be significantly different from the mean of the
Ambrosio-Wortman model. This was also true for the variances of the two
models. This indicates that within the limits of this investigation there
is no significant difference between the model derived from free flight
data and the model derived from wind tunnel data. That is, an increase
in the variance of main effects or experimental effects due to the
perturbations of holding devices and expanding flow in wind tunnel tests
cannot be detected. This is not to say that no such increase exists. It is
indicated in Table VII that the regression model is to some degree more
accurate than the Ambrosio-Wortman model, as the 95 percent confidence
limits on the means are narrower, but not significantly so.

Therefore, within the limits of the aerodynamic and statistical
assumptions of this investigation, the following general observations
are made:

1)  The model derived is a reliable model for the prediction of
shock detachment distance as a function of Mach number.

2)  The model derived with free flight data is apparently not
significantly better than models derived from wind tunnel data.

3)  The use of statistical methods for the analysis of data can
lead to increased knowledge of the functional relationships of physical
variables.

4)  The inferences that can be made through the analysis of data
by statistical methods are more objective inferences than could
otherwise be made.

5)  Statistics is an extremely useful tool for the analysis
of data which are functions of physical relationships and in many cases
leads to increased confidence in the results of the analysis over mere
visual inspection of experimental responses.

V.  SUGGESTED FUTURE STUDIES. The results of this study
indicate that the shock detachment distance for radius nosed bodies is
strongly a function of Mach number between 1.0 and about 2.5. After
2.5, the detachment distance is practically independent of Mach number.
This was established by the Duncan range test, which shows that there is
apparently no significant difference between the responses obtained at the
M2 level (2.5 to 2.9) and the M3 level (3.9 to 4.3). Therefore, it seems
appropriate to perform future studies in the Mach range of 1.0 to 2.5
to obtain a better understanding of the function where the variation is
most sensitive. This will provide a better estimate of the universe
regression line of the shock detachment distance in this velocity range.

Another important point to consider for future experimental studies
is to confound the daily variation with a selected interaction, since this
study shows that there is apparently no significant effect of interaction
on the shock detachment distance. In this study, the day effect was
confounded with the experimental error and main effects through
randomization of all factor levels and combinations with days. Another
approach would be, through design, to confound a priori the day effects
with the interaction. This would separate the variance due to day
effects from the experimental error and main effects and may result
in a more sensitive test for main effects. However, this does not
necessarily follow, because the degrees of freedom for error would be
reduced for the same number of responses. If the day effects are not
large, the separation of the day effects may not be sufficient to offset
the reduction in error degrees of freedom. This would require judgment
in future designs. In this study, it is believed that it was advantageous
to randomly distribute the day effects rather than confounding
them with the main or secondary effects, since one objective was to test
for significance of interaction.

The very high significance of the Mach number factor indicated
that further tests should be initiated to include other factors such as free
stream density and some discrete levels of body surface roughness
(density and body surface roughness effects were summed as experimental
error in this study).

A suggested experiment of academic interest would be a 4^3 factorial
with day effects confounded with the highest order interaction. The
three factor, four level experiment is suggested in order to test for
one degree higher order (cubic) effects. Models of constant radius, but
with four levels of surface roughness, at four levels of free stream
density and four levels of velocity would be flown in free flight.

This experiment would enable, through the analysis of variance, the
determination of cubic, surface roughness (S), and density (\rho) effects in
addition to velocity effects. Since the first order interaction in this
study, (MR)_{ij}, was not significant, the day effects could be confounded
with the second order interaction (MS\rho)_{ijk}.
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Appendix  A 

EXPERIMENTAL  COMPUTATIONS 


Sonic velocity was computed for each round from the following
equation:

(A-1)  a = \sqrt{\gamma R_g T}.

Model velocity was computed for each round from the following
equation:

(A-2)  V = 5 feet / t.

Mach number was computed for each round from the following
equation:

(A-3)  M = V / a.

The magnification factors for the shadowgraph (F_sh) and Schlieren
(F_sc) systems were computed for all rounds from the following equation:

(A-4)  F_sh and F_sc = [\sum_{N=1}^{8} (Film Model Diameter) / N]
                       / [\sum_{N=1}^{8} (Model Diameter) / N].

The computed values are:

       F_sh = 0.226 / 0.224 = 1.009

(A-5)  F_sc = 0.1915 / 0.224 = 0.855.


Shock detachment distance and model radius, correcting for
magnification and location, were computed as follows:

(A-6)  \Delta / R = (1/2) [ \Delta_sc / R_sc + \Delta_sh / R_sh ]

but

(A-7)  R_sc (counts) = C x R_b x F_sc x R_r

       R_sh (counts) = C x R_b x F_sh x R_r,

therefore

(A-8)  \Delta / R (counts)
         = (1/2) [ \Delta_sc / (C x R_b x F_sc x R_r)
                   + \Delta_sh / (C x R_b x F_sh x R_r) ]
         = (\Delta_sc F_sh + \Delta_sh F_sc)
           / [ 2 (C x R_b x F_sc x F_sh x R_r) ].

Therefore,

(A-9)  \Delta (counts corrected) = \Delta_sc F_sh + \Delta_sh F_sc

and

(A-9)  R (counts corrected) = 2 (C x R_b x F_sc x F_sh x R_r).


Example computations for round 75 as shown in Table II.

       a = \sqrt{1.4 x 1715 x (460 + 71)} = 1131

       V = 5 ft / 0.003531 sec* = 1416

* This value for round 75 and all other rounds was obtained from
submicrosecond electronic counters as recorded in the aeroballistic
data log.
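The round 75 example can be reproduced with equations (A-1) through (A-3). The Python sketch below uses the numbers printed in the example. The units are an assumption: the gas constant 1715 is taken to be in ft^2/(sec^2 °R) and the temperature in degrees Rankine (460 + °F), which the source does not state. Evaluated directly, the sonic velocity comes out near 1129 ft/sec, slightly below the 1131 printed in the source.

```python
import math

GAMMA = 1.4                 # ratio of specific heats for air
GAS_CONSTANT = 1715.0       # assumed units: ft^2 / (sec^2 * deg R)

temperature_r = 460.0 + 71.0    # deg Rankine, from 71 deg F
baseline_ft = 5.0               # distance between timing stations
elapsed_sec = 0.003531          # counter reading for round 75

sonic_velocity = math.sqrt(GAMMA * GAS_CONSTANT * temperature_r)   # (A-1)
model_velocity = baseline_ft / elapsed_sec                         # (A-2)
mach_number = model_velocity / sonic_velocity                      # (A-3)

print(f"a = {sonic_velocity:.0f} ft/sec")
print(f"V = {model_velocity:.0f} ft/sec")
print(f"M = {mach_number:.2f}")
```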
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STATISTICAL  COMPUTATIONS 
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l.  Analysis  of  Variance 

The computations for the analysis of variance were made from
the data shown in Table III.

Sums of squares are listed below.

Total sum of squares

(B-1)  SS_t = \sum_i^a \sum_j^b \sum_k^r X_{ijk}^2 - (\sum X_{...})^2 / (rab)

            = 50.746 - (21.414)^2 / (3 x 3 x 3)

            = 33.7628.

Sum of squares due to radius

(B-2)  SS_R = \sum_j^b X_{.j.}^2 / (ra) - (\sum X_{...})^2 / (rab)

            = [(4.810)^2 + (10.135)^2 + (6.469)^2] / 9 - (21.414)^2 / 27

            = 18.6335 - 16.9836

            = 1.6499.

Sum of squares due to Mach number

(B-3)  SS_M = \sum_i^a X_{i..}^2 / (rb) - (\sum X_{...})^2 / (rab)

            = [(17.700)^2 + (2.016)^2 + (1.698)^2] / 9 - (21.414)^2 / 27

            = 35.5819 - 16.9836

            = 18.5983.

Sum of squares due to MR interaction

(B-4)  SS_MR = \sum_i \sum_j X_{ij.}^2 / r - \sum_i X_{i..}^2 / (rb)
               - \sum_j X_{.j.}^2 / (ra) + (\sum X_{...})^2 / (rab)

             = [(3.827)^2 + (0.481)^2 + (0.502)^2 +
                (8.860)^2 + (0.709)^2 + (0.566)^2 +
                (5.013)^2 + (0.826)^2 + (0.630)^2] / 3
               - 1.6499 - 18.5983 - 16.9836

             = 2.9885.

Sum of squares due to error

       SS_e = SS_t - SS_R - SS_M - SS_MR

            = 33.7628 - 1.6499 - 18.5983 - 2.9885

            = 10.5261.

Sum of squares due to linear and quadratic effects within main and
interaction effects. (Coefficients of orthogonal polynomials)*


*C. R. Hicks, Fundamental Concepts in the Design of Experiments,
New York, New York, Holt, Rinehart and Winston, 1964
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Error  mean  square  =  0.  5847  with  18  d.  f. 
Standard error of mean is

(B-8)  S_{\bar{X}} = \sqrt{Error MS / No. of Obs.} = \sqrt{0.5847 / 9} = 0.2545.

From Table E* (\alpha = 0.05, d.f. = 18) the significant ranges are

(B-9)  p      = 2       3
       ranges = 2.97    3.12.

Multiplying the ranges by S_{\bar{X}}, the least significant ranges are

(B-10)  LSR = 0.756    0.796.

Largest versus smallest:

(B-11)  1.967 - 0.189 = 1.778 > 0.796 (significant)

Largest versus second smallest:

(B-12)  1.967 - 0.224 = 1.743 > 0.756 (significant)

Second largest versus smallest:

(B-13)  0.224 - 0.189 = 0.035 < 0.756

(See Figure 4 for display of results).
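The range comparisons for the Mach number means can be sketched in Python as follows. This is a simplified pairwise version of the Duncan procedure that omits the stepwise protection rules of the full test. The significant ranges (2.97 and 3.12) are the table values quoted above; attaching the labels M1, M2, and M3 to the three means is an assumption based on the discussion of Figure 4(a).

```python
import math

error_ms = 0.5847
obs_per_mean = 9
significant_ranges = {2: 2.97, 3: 3.12}        # alpha = 0.05, 18 d.f.
means = {"M1": 1.967, "M2": 0.224, "M3": 0.189}  # labels are an assumption

se_mean = math.sqrt(error_ms / obs_per_mean)
lsr = {p: q * se_mean for p, q in significant_ranges.items()}

ordered = sorted(means.items(), key=lambda item: item[1])   # ascending means

results = {}
for i in range(len(ordered)):
    for j in range(i + 1, len(ordered)):
        span = j - i + 1                        # number of means spanned
        difference = ordered[j][1] - ordered[i][1]
        results[(ordered[i][0], ordered[j][0])] = difference > lsr[span]

for pair, significant in results.items():
    print(pair, "significant" if significant else "not significant")
```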
b.  Radius Effects

(B-14)  treatment means = 1.126    0.719    0.534
        radius level    = 2        3        1

Standard error of mean is

(B-15)  S_{\bar{X}} = \sqrt{Error MS / No. of Obs.} = \sqrt{0.5847 / 9} = 0.2545.

From Table E* (\alpha = 0.05, d.f. = 18) the significant ranges are

(B-16)  p      = 2       3
        ranges = 2.97    3.12.

Multiplying the ranges by S_{\bar{X}}, the least significant ranges are

(B-17)  LSR = 0.756    0.796.

Largest versus smallest:

        1.126 - 0.5344 = 0.5916 < 0.796

Largest versus second smallest:

        1.126 - 0.719 = 0.407 < 0.756

Second largest versus smallest:

        0.719 - 0.534 = 0.184 < 0.756.

(See Figure 4 for display of results).

*Hicks, loc. cit.

3.  Computations for Testing the Model

a.  Computation of Experiment Mean and Variance

    (\bar{X}_e = 0.793)

    X_i      (X_i - \bar{X}_e)^2        X_i      (X_i - \bar{X}_e)^2
    1.049    0.0655                     0.188    0.3660
    1.492    0.4886                     0.253    0.2916
    1.286    0.2430                     0.182    0.3732
    0.146    0.4186                     0.179    0.3769
    0.146    0.4186                     0.205    0.3457
    0.189    0.3648                     1.034    0.0580
    0.223    0.3249                     2.243    2.1025
    0.139    0.4277                     1.736    0.8892
    0.140    0.4264                     0.226    0.3226
    1.461    0.4462                     0.280    0.2631
    5.478    21.9492                    0.321    0.2227
    1.921    1.2723                     0.203    0.3481
    0.268    0.2756                     0.210    0.3398
                                        0.217    0.3317

    \sum X_i = 21.414        \sum (X_i - \bar{X}_e)^2 = 33.752

    \bar{X}_e = 21.414 / 27 = 0.793

    S_e^2 = 33.752 / (27 - 1) = 1.298

    S_e = \sqrt{1.298} = 1.139

b.  Computation of Regression Model Mean and Variance

    (\bar{X}_r = 0.726)

    X_i      (X_i - \bar{X}_r)^2        X_i      (X_i - \bar{X}_r)^2
    1.382    0.4303                     0.314    0.1697
    1.618    0.7956                     0.282    0.1971
    1.535    0.6544                     0.154    0.3271
    0.303    0.1789                     0.151    0.3306
    0.336    0.1521                     0.152    0.3294
    0.288    0.1918                     1.271    0.2970
    0.149    0.3329                     1.668    0.8873
    0.154    0.3271                     1.568    0.7089
    0.152    0.3294                     0.276    0.2025
    1.557    0.6905                     0.283    0.1962
    2.035    1.7134                     0.232    0.2440
    1.714    0.9761                     0.137    0.3469
    0.299    0.1823                     0.167    0.3124
                                        0.142    0.3410

    \sum X_i = 19.597        \sum (X_i - \bar{X}_r)^2 = 11.8449

    \bar{X}_r = 19.597 / 27 = 0.726

    \sigma_r^2 = 11.8449 / 27 = 0.4387

    \sigma_r = \sqrt{0.4387} = 0.6623


c.  Computation of Mean and Variance of Ambrosio and
    Wortman's Model (Z) for the Experimental Conditions
    of this Study

    Model:  \Delta / R = 0.143 e^{3.24 / M^2}

    (\bar{X}_aw = 0.6875)

    X_i      (X_i - \bar{X}_aw)^2       X_i      (X_i - \bar{X}_aw)^2
    1.133    0.1984                     0.176    0.2616
    1.642    0.9110                     0.176    0.2616
    1.444    0.5722                     1.493    0.6568
    0.218    0.2199                     3.180    6.2125
    0.229    0.2097                     1.910    1.4945
    0.214    0.2237                     0.218    0.2204
    0.175    0.2626                     0.222    0.2166
    0.212    0.2261                     0.209    0.2289
    0.176    0.2616                     0.212    0.2261
    0.175    0.2626                     0.197    0.2411
    0.176    0.2616                     0.172    0.2657
    0.951    0.0694                     0.179    0.2636
    1.783    1.2096                     0.173    0.2647
    1.519    0.6914

    \sum X_i = 18.564        \sum (X_i - \bar{X}_aw)^2 = 16.3939

(B-18)  \bar{X}_aw = 18.564 / 27 = 0.6875

        \sigma_aw^2 = 16.3939 / 27 = 0.6072

        \sigma_aw = \sqrt{0.6072} = 0.7792


95 percent confidence limits on experiment mean

(B-19)  \bar{X}_e(0.95) = 0.793 ± (1.139 / \sqrt{n}) (2.06)
                        = 0.793 ± 0.451 = 1.244 to 0.342

95 percent confidence limits on regression mean

(B-20)  \bar{X}_r(0.95) = 0.726 ± (0.6623 / \sqrt{n}) (1.96)
                        = 0.726 ± 0.249 = 0.975 to 0.477

95 percent confidence limits on Ambrosio-Wortman Model mean

(B-21)  \bar{X}_aw(0.95) = 0.6875 ± (0.7792 / \sqrt{n}) (1.96)
                         = 0.6875 ± 0.293 = 0.981 to 0.395.
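The three sets of confidence limits follow one pattern: mean ± (critical value) x (standard deviation) / sqrt(n). A minimal Python sketch, using the means and standard deviations computed in parts a through c and the critical values quoted above (2.06 for the t distribution, 1.96 for the normal); the helper name is illustrative:

```python
import math

def mean_limits(mean: float, std_dev: float, n: int, critical: float):
    """Return (lower, upper, half_width) of the confidence limits on a mean."""
    half_width = critical * std_dev / math.sqrt(n)
    return mean - half_width, mean + half_width, half_width

n = 27
cases = {
    "Experiment":       (0.793, 1.139, 2.06),
    "Regression Model": (0.726, 0.6623, 1.96),
    "Ambrosio-Wortman": (0.6875, 0.7792, 1.96),
}

limits = {name: mean_limits(mean, sd, n, crit)
          for name, (mean, sd, crit) in cases.items()}

for name, (lower, upper, half) in limits.items():
    print(f"{name:17s} ± {half:.3f}: {upper:.3f} to {lower:.3f}")
```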


d. Tests of Means and Variances

Hypothesis: $\bar{X}_e = \bar{X}_r$

(B-22)  $t(\alpha/2 = 0.025,\ \text{d.f.} = 26) = \pm 2.06$

        $t = \frac{\bar{X}_e - \bar{X}_r}{S_e/\sqrt{n}} = \frac{0.793 - 0.726}{1.139/\sqrt{27}} = \frac{0.067}{1.139/5.196} = 0.305$

Computed value less than table value. Test not significant. Accept
hypothesis.

Hypothesis: $S_e^2 = \sigma_r^2$

(B-23)  $\chi^2(\alpha/2 = 0.025,\ \text{d.f.} = 26) = 13.8$ to $41.9$

        $\chi^2 = \frac{n S_e^2}{\sigma_r^2} = 79.88$

Computed value exceeds table value. Test is significantly higher.
Reject hypothesis.

Hypothesis: $\bar{X}_r = \bar{X}_{aw}$

(B-24)  $Z(\alpha/2 = 0.025) = \pm 1.960$

        $\sigma_{r-aw} = \sqrt{\frac{\sigma_r^2}{27} + \frac{\sigma_{aw}^2}{27}} = \sqrt{0.01624 + 0.02248} = \sqrt{0.03872} = 0.1968$

        $Z = \frac{\bar{X}_r - \bar{X}_{aw}}{\sigma_{r-aw}} = \frac{0.726 - 0.6875}{0.1968} \approx 0.196$

Computed value less than table value. Test not significant. Accept
hypothesis.

Hypothesis: $\sigma_r^2 = \sigma_{aw}^2$

(B-25)  $\chi^2(\alpha/2 = 0.025,\ \text{d.f.} = 26) = 13.8$ to $41.9$

        $\chi^2 = \frac{n \sigma_r^2}{\sigma_{aw}^2} = \frac{27(0.4387)}{0.6072} = 19.510$

Computed value between table values. Test not significant. Accept
hypothesis.
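As a check on the arithmetic in (B-22), (B-24) and (B-25), the statistics can be recomputed from the summary values quoted above. This is a minimal sketch; the variable names are illustrative and the critical values are the table values given in the text.

```python
import math

n = 27
mean_e, s_e = 0.793, 1.139          # experiment mean and standard deviation
mean_r, var_r = 0.726, 0.4387       # regression model mean and variance
mean_aw, var_aw = 0.6875, 0.6072    # Ambrosio-Wortman model mean and variance

# (B-22): t statistic for the experiment mean against the regression mean.
t_stat = (mean_e - mean_r) / (s_e / math.sqrt(n))

# (B-24): normal deviate for the difference of the two model means.
se_diff = math.sqrt(var_r / n + var_aw / n)
z_stat = (mean_r - mean_aw) / se_diff

# (B-25): chi-square statistic comparing the two model variances.
chi2_stat = n * var_r / var_aw

t_significant = abs(t_stat) > 2.06
z_significant = abs(z_stat) > 1.960
chi2_significant = not (13.8 <= chi2_stat <= 41.9)

print(t_stat, se_diff, z_stat, chi2_stat)
print(t_significant, z_significant, chi2_significant)
```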


e. Computations for Operating Characteristics Curve for
Two-Tail Test of Differences Between the Mean of the
Regression Model ($\bar{X}_r$) and the Mean of the Ambrosio-
Wortman Model ($\bar{X}_{aw}$)

Assumption - the variances are known for both models.

[The formula at this point is only partly legible in the source. The surviving terms are $N_{aw}\sigma_r^2 + N_r\sigma_{aw}^2$, evaluated as $27(0.4387) + 27(0.6072)$.]


These data are plotted in Figure 5.

[Column headings are not legible in the source. Cells lost in extraction are shown as "...".]

...       ...       0.50      1.46     ...     0.10
0.1476    0.2040    0.75      ...      ...     0.14
...       0.2720    1.00      ...      0.81    0.19
0.2460    0.3400    1.25      0.71     0.74    0.26
0.2952    0.4080    1.50      0.46     0.65    0.35
0.3936    0.5440    2.00     -0.04     0.50    0.50
0.4920    0.6800    2.50     -0.54     0.32    0.68
0.5904    0.8160    3.00     -1.04     0.17    0.83
0.6888    0.9520    3.50     -1.54     0.09    0.91
0.7872    1.0880    4.00     -2.04     0.03    ...
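An operating characteristic curve for a two-tail test with known variances can be sketched as below. This assumes the usual normal-theory form, in which the probability of accepting the hypothesis at a true difference of d standard errors is $\Phi(1.96 - d) - \Phi(-1.96 - d)$; the function name `oc_two_tail` is illustrative, and the values it gives need not agree to two decimals with the tabulated ones, which may have been read from charts.

```python
from statistics import NormalDist

def oc_two_tail(d, z_crit=1.960):
    """Probability of accepting H0 when the true difference is d standard errors."""
    phi = NormalDist().cdf
    return phi(z_crit - d) - phi(-z_crit - d)

SIGMA_DIFF = 0.1968  # standard error of the difference of means, from (B-24)

for d in (0.50, 0.75, 1.00, 1.25, 1.50, 2.00, 2.50, 3.00, 3.50, 4.00):
    beta = oc_two_tail(d)
    print(f"d = {d:4.2f}  difference = {d * SIGMA_DIFF:6.4f}  "
          f"P(accept) = {beta:4.2f}  P(reject) = {1 - beta:4.2f}")
```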
PRESENTATION OF THE FIRST
SAMUEL S. WILKS MEMORIAL MEDAL*

Frank E. Grubbs

ACCEPTANCE OF THE FIRST WILKS MEMORIAL AWARD

John W. Tukey


It  is  indeed  a  pleasure  to  have  Mrs.  Samuel  S.  Wilks  with  us  this 
evening  for  the  presentation  of  the  first  Samuel  S.  Wilks  Memorial 
Medal  Award. 

The Samuel S. Wilks Memorial Award for statisticians was established
and announced a year ago at the Tenth Conference on Design of
Experiments in Army Research, Development and Testing. An account
of the announcement of the Wilks Award is given in the American
Statistician for December, 1964. The idea for the Award was due to Major
General Leslie E. Simon (Ret.), who gave the opening paper at the Tenth
Design of Experiments Conference entitled "The Stimulus of S. S. Wilks
to Army Statistics". The Wilks Memorial Award is sponsored by the
American Statistical Association through the generosity of Mr. Philip
G. Rust, retired industrialist of the Winnstead Plantation, Thomasville,
Georgia. The American Statistical Association accepted the obligation
of administering the Award and funds in accordance with guidance and
criteria which are consonant with law and with the wishes of the Army
representatives, Mr. Rust, and the American Statistical Association.

The  name  of  the  recipient  of  the  Wilks  Award  is  announced  each  year 
during  the  annual  Conference  on  Design  of  Experiments  in  Army  Research, 
Development  and  Testing. 

With the approval of the President of the American Statistical
Association the Wilks Award Committee for 1965 consisted of:

Dr. Francis G. Dressel, Duke University and the Army Research
Office-Durham

Dr. Churchill Eisenhart, National Bureau of Standards

Professor Oscar Kempthorne, Iowa State University

Dr. Alexander M. Mood, U. S. Office of Education

Major General Leslie E. Simon (Ret.), Winter Park, Florida

Dr. Frank E. Grubbs, Ballistic Research Laboratories, Aberdeen
Proving Ground, Maryland - Chairman

* After the dinner meeting at the Eleventh Conference on Design of
Experiments in Army Research, Development and Testing, the chairman of the
conference, Dr. Frank E. Grubbs, gave the above address. Professor
John W. Tukey was presented the first Wilks Memorial Award. Following
his acceptance of this honor he spoke to the group about his friend
Sam Wilks.

The Wilks Award Committee met during the annual meeting of the
American Statistical Association in Philadelphia on 8-10 September 1965.
Many candidates for the 1965 Wilks Award were considered based on
nominations from individuals and also statisticians thought worthy of
consideration by the committee.

The Wilks Award is not limited to contributors to design of experiments
activities in connection with Army research, development and testing,
but rather all statisticians who have made significant contributions to
the general field of Army statistical endeavors, whether theoretical or
applied, are eligible. Moreover, persons eligible for the award include
not only government statisticians but also those from universities and
industry. The annual programs of the Conference on Design of Experiments
in Army Research, Development and Testing indicate rather broadly
the nature of statistical endeavors of interest to the Army, but the
achievements of those being considered for the award need not be restricted to
these areas. Rather, as indicated earlier, the awardee is selected for
the advancement of scientific or technical knowledge in statistical efforts
which coincidentally will have benefited the Army and government in one
way or another.

As  a  result  of  the  committee  meeting,  it  is  a  great  pleasure  to 
announce  that  Professor  John  W.  Tukey  of  Princeton  University  has  been 
selected  to  receive  the  first  Samuel  S.  Wilks  Memorial  Medal  Award. 

Professor Tukey has long been an authority on the statistical analysis
of data and has received wide recognition for his many contributions to
mathematical statistics and applied statistics in many different fields.
Professor Tukey has contributed to the Army Design of Experiments
Conferences from the beginning and gave freely of his time to promulgating
the uses of statistics in Army applications, DOD applications,
Government and industrial applications. The citation for the first Wilks
medalist reads as follows:

To John W. Tukey for his contributions to the theory
of statistical inference, his development of procedures
for analyzing data, and his influence on applications of
statistics in many fields.

Upon  receiving  the  Wilks  Medal,  Professor  Tukey  responded  as 
follows: 

We are met to honor Sam Wilks' memory. All of us would have so
much preferred to have had him here instead. Many of us knew him for
ten or twenty years, some for thirty. No matter whether we knew him
intimately as a close colleague and friend or only as someone met once
a year at such a recurring event as this, we all respected him and all
he stood for. In this we are but a small sample.

The memorial minute of the Princeton University faculty begins
thus: "Samuel Stanley Wilks died in his sleep on March 7, 1964 at the
peak of a distinguished career in teaching, research, and public service.
His sudden death, without any warning, leaves many friends and associates
stunned by a sudden loss of a man upon whom they depended for advice on
problems large and small, for a wise appraisal of proposals under
consideration, for getting many jobs done -- a man instinctively so friendly
and fair that everyone responded to him with great affection. His death
terminates a quiet, penetrating, and influential leadership in the work
of many organizations -- especially in mathematics, statistics, and
social science -- to which he brought wisdom, commitment, persistence,
and a remarkable sense of the importance of new developments.
His passing leaves an emptiness in so many plans, that one wonders how
one man was so versatile and did so much".

The memorial notice of the American Philosophical Society approaches
its end thus [1]: "In his service to our Society, Sam showed all the
wonderful characteristics we have noticed elsewhere: quiet, modest
diligence, deep wisdom, a technical skill that was always adequate to
any demand; the ability to comprehend, and bring others to comprehend,
the broader issues." The notice then ends: "Mosteller's memoir,
written for statisticians, was fittingly entitled: 'Samuel S. Wilks:
Statesman of Statistics'. As members of Benjamin Franklin's own
society it is only right that we salute our departed colleague and friend as
'Sam: A Quiet Contributor to Mankind'".

On the afternoon of his death Sam told my wife: "Now that so many
[...] it's time that John and I worked out something new to do." I
never saw Sam again; what we are working out in Princeton today is not
what it would have been under his leadership, but we can, and will, do
our best to make the new Department of Statistics something of which
Sam would have been proud.

For thirty years he kept Fine Hall statistics in balanced contact with
mathematics on the one hand and with a wide variety of applications on
the other, showing clearly by his example how it was best to combine both.
His recognition of the dangers of tight Gaussian assumptions led him to
pioneer with non-parametric methods. His recognition of the growing
importance of computing came very early; the first punched card equipment
on the Princeton campus occupied the room next to his office.

As a unified Princeton statistics comes into being and grows, we
will do all we can to continue his tradition. We will emphasize the need
for combining contact with mathematics and contact with applications.
We will do all we can to bring statistics, computer science, and the use
of computer facilities ever closer together. We will try to be ever more
realistic in understanding the problems of the real world and in formulating
those pale copies of real problems, whose solutions serve to guide
us as we face reality. We can do no less if we are to follow his noble
[...]
TARGET COVERAGE PROBLEMS


William C. Guenther

University of Wyoming, Laramie, Wyoming


Much  of  the  material  contained  in  this  paper  is  a  review  of 
literature  which  has  appeared  in  many  different  publications. 

The definition of a single shot coverage problem which was
given in a paper by Guenther and Terragno [1] is extended to
a multiple shot case. The results which were reviewed in
reference 1 appear here in abstracted form since they are useful
for the new extension. Some models for the multiple shot
case are considered in detail. The latter include some for
which results have not been previously published. It is hoped
that this paper will be a coordinating force for future research.

In recent years a large number of publications have appeared on
probability problems arising from ballistic applications. Many of these papers
and reports are concerned with topics which are often referred to as
coverage problems. A definition of a coverage problem, which yields
many interesting models as special cases, appears in a paper by Guenther
and Terragno [1] and will be reproduced here. That definition was for
the single shot case but only minor modifications are required to extend
it to a multiple shot situation. Further modifications may be necessary
if it is desired that the definition yield certain other problems, which have
already been investigated or may be formulated in the future, as special
cases.

Although most work in this field has been restricted to the two-
dimensional case, some applications are meaningful in three dimensions.
It is doubtful that the coverage problem has any useful interpretation in
more than three dimensions. We will use n-dimensional notation not only
because it includes the cases n = 2 and n = 3 but also because results
one derives can occasionally be used in unexpected places where n
dimensions are meaningful.

For brevity we will use the notation $X_i = (x_{i1}, x_{i2}, \ldots, x_{in})$, and
an integral with respect to $X_i$, written with a single integral sign, will
represent an n-fold integral.
DEFINITION FOR THE SINGLE SHOT CASE. Before attempting to
define a coverage problem, let us consider a special case which will help
to introduce some of the essential ideas and language. Suppose that a
point target is located at the origin of a two-dimensional coordinate system.
A weapon with killing radius R is aimed at the origin with the intention of
destroying the point target. When the weapon arrives at the target, the
latter is located at $X_2 = (x_{21}, x_{22})$, a randomly selected position within
or on a circle of radius D centered at the origin (see Figure 1). That
is, the probability density function of $X_2$ is

$$g(x_{21}, x_{22}) = \frac{1}{\pi D^2}, \qquad 0 \le x_{21}^2 + x_{22}^2 \le D^2.$$

Fig. 1. $X_2$ is point target and weapon has killing radius R.

Assume that aiming errors are circularly normally distributed with
unit variance so that the center of the lethal circle $X_1 = (x_{11}, x_{12})$ has
p.d.f.

$$f(x_{11}, x_{12}) = \frac{1}{2\pi} \exp\left[-\tfrac{1}{2}\left(x_{11}^2 + x_{12}^2\right)\right].$$

Now a given point $X_2$ will be destroyed if the impact point of the weapon
is within R units of $X_2$. The probability that this happens is

$$h(x_{21}, x_{22}) = \iint_{C_1} f(x_{11}, x_{12})\, dx_{11}\, dx_{12}$$

where $C_1$ is the region $(x_{11} - x_{21})^2 + (x_{12} - x_{22})^2 \le R^2$. The probability
of destroying the target (that is, the probability that the impact point is
within R units of the target given that the target is as likely to be at one
point as at any other within the circle of radius D) is

$$P(R, D) = \iint_{C_2} h(x_{21}, x_{22})\, g(x_{21}, x_{22})\, dx_{21}\, dx_{22}$$

where $C_2$ is the region $x_{21}^2 + x_{22}^2 \le D^2$. The evaluation of P(R,D) for
any number of dimensions is discussed in Section 2 of reference 1 and is
mentioned in the abstract of that paper which appears in the next section.
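The two-dimensional example can be checked by simulation. The sketch below is a plain Monte Carlo estimate of P(R, D) under the stated assumptions (target uniform within the circle of radius D, impact point circular normal with unit variance); the function name, trial count and seed are illustrative choices, not part of the paper.

```python
import math
import random

def estimate_p(R, D, trials=200_000, seed=0):
    """Monte Carlo estimate of P(R, D) for the two-dimensional single shot case."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Target position: uniform within the circle of radius D.
        # Taking the square root of a uniform variate makes the draw uniform in area.
        rho = D * math.sqrt(rng.random())
        theta = 2.0 * math.pi * rng.random()
        tx, ty = rho * math.cos(theta), rho * math.sin(theta)
        # Impact point: circular normal aiming error with unit variance.
        ix, iy = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        # The target is destroyed if it lies within R units of the impact point.
        if (ix - tx) ** 2 + (iy - ty) ** 2 <= R * R:
            hits += 1
    return hits / trials

print(estimate_p(1.0, 1.0))
```

As D shrinks toward zero the estimate should approach $1 - e^{-R^2/2}$, the probability that a circular normal impact point falls within R of a target fixed at the origin.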


Now let us formulate the definition of a coverage problem for the
single shot case. Let $X_1$ be the impact point of the weapon, $X_2$ be the
position of the target at the time of impact, $P_1(X_1, X_2)$ = probability of
destroying the target for given values of $X_1$ and $X_2$ (sometimes called
the damage function), $F(X_1)$ = the distribution function of the impact
point, $G(X_2)$ = the distribution function of $X_2$. Then

$$P_2(X_2) = \int_{-\infty}^{\infty} P_1(X_1, X_2)\, dF(X_1)$$

= probability a given $X_2$ is destroyed

and

$$P(\cdot) = \int_{-\infty}^{\infty} P_2(X_2)\, dG(X_2)$$

= probability of destroying a point target whose
position is governed by $G(X_2)$.

We will define a single shot coverage problem as the computation of a
probability of the type $P(\cdot)$; that is, the evaluation of

(1)  $P(\cdot) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} P_1(X_1, X_2)\, dF(X_1)\, dG(X_2).$

All three functions $P_1(X_1, X_2)$, $F(X_1)$, and $G(X_2)$ (and consequently
$P(\cdot)$) will in general depend upon parameters.

Although the order of integration in (1) has proven to be the most
efficient in the majority of problems which have been studied, there is
no reason why that order cannot be reversed if it is profitable to do so.
This change gives

(2)  $P(\cdot) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} P_1(X_1, X_2)\, dG(X_2)\, dF(X_1).$

Several special cases are worthy of consideration. If

(3)  (a)  $P_1(X_1, X_2) = 1$,  $X_1 \in$ region $C_1$ (usually [...])
                          $= 0$,  otherwise

     (b)  $g(X_2) = 1$,  $X_2 = B = (b_1, \ldots, b_n)$
                   $= 0$,  otherwise,

then (1) reduces to

(4)  $P(\cdot) = \int_{C_1} dF(X_1)$

which is the probability content of region $C_1$ under distribution $F(X_1)$.
If (a) of (3) is satisfied (sometimes called a zero-one damage function)
but $G(X_2)$ does not concentrate all the probability at one point, then (1)
reduces to

(5)  $P(\cdot) = \int_{-\infty}^{\infty} \int_{C_1} dF(X_1)\, dG(X_2),$

where in general $C_1$ is defined in terms of both $X_1$ and $X_2$.
If $X_2$ is uniformly distributed over a region $C_2$, that is

(6)  $g(X_2) = \frac{1}{V(C_2)}$,  $X_2 \in C_2$
            $= 0$,  otherwise

where $V(C_2)$ is the volume of $C_2$, and the damage function is zero-one,
then $P(\cdot)$ can be interpreted as the expected fraction of overlap of the
region of total destruction $C_1$ and a target area $C_2$. To see this integrate
in reverse order. Given a value of $X_1$ (see Figure 2), $X_2$
is captured if it lies in the region common to $C_1$ and $C_2$. The
probability that this happens is

$$\int_{C_1 \cap C_2} \frac{1}{V(C_2)}\, dX_2 = \frac{V(X_1)}{V(C_2)}$$

where $V(X_1)$ is the volume common to $C_1$ and $C_2$ for given $X_1$. Then
integrating over $X_1$ we get

$$\int_{-\infty}^{\infty} \frac{V(X_1)}{V(C_2)}\, dF(X_1) = E\left[\frac{V(X_1)}{V(C_2)}\right]$$

which is, by definition, the expected fraction overlap. Multiplying the
latter result by $V(C_2)$ gives $E[V(X_1)]$ or the expected overlap.

Fig. 2. Circular area of total destruction $C_1$ and target area $C_2$.

When the damage function is not of the zero-one type and $X_2$ has the
density (6), then $P(\cdot)$ can again be interpreted as the fraction of the
target area destroyed. This is best seen by writing $P(\cdot)$ as

$$P(\cdot) = \int_{C_2} P_2(X_2)\, \frac{1}{V(C_2)}\, dX_2$$

and observing that since $P_2(X_2)$ can be interpreted as the fraction of
the point $X_2$ destroyed, $E[P_2(X_2)]$ is the fraction of the target area
which is destroyed. Morganthaler [2] has used this interpretation.
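The expected-fraction-of-overlap interpretation can be illustrated numerically for two circles. The sketch below averages the area common to a lethal circle of radius R centered at a circular normal impact point and a target circle of radius D centered at the origin, then divides by the target area. The lens-area formula is the standard one for two intersecting circles; the function names, trial count and seed are illustrative assumptions.

```python
import math
import random

def lens_area(d, R, D):
    """Area common to circles of radius R and D whose centers are d apart."""
    if d >= R + D:
        return 0.0                      # circles do not meet
    if d <= abs(R - D):
        return math.pi * min(R, D) ** 2  # smaller circle lies inside the larger
    # Clamp the cosines to guard against rounding just outside [-1, 1].
    cos_a = max(-1.0, min(1.0, (d * d + R * R - D * D) / (2.0 * d * R)))
    cos_b = max(-1.0, min(1.0, (d * d + D * D - R * R) / (2.0 * d * D)))
    triangle = 0.5 * math.sqrt((-d + R + D) * (d + R - D) * (d - R + D) * (d + R + D))
    return R * R * math.acos(cos_a) + D * D * math.acos(cos_b) - triangle

def expected_fraction_overlap(R, D, trials=100_000, seed=1):
    """Average of V(X1) / V(C2) over circular normal impact points X1 (unit variance)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        d = math.hypot(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
        total += lens_area(d, R, D)
    return total / (trials * math.pi * D * D)

print(expected_fraction_overlap(1.0, 1.0))
```

Under the zero-one damage function this average estimates the same quantity as P(R, D) for a uniformly distributed point target, which is the content of the argument above.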

SOME SPECIFIC RESULTS FOR SINGLE SHOT CASE--GUENTHER-
TERRAGNO PAPER. A comprehensive review of results for the single
shot case has been published by Guenther and Terragno [1]. This paper
lists 58 references of which about 30 deal directly with target coverage.
A thorough knowledge of results for the single shot case is extremely
helpful in the multiple shot situation. This section will be an abstract
of that paper.

For most models discussed in the review it is assumed that $X_1$ has
density

(7)  $f(X_1) = f(x_{11}, \ldots, x_{1n}) = \left[(2\pi)^{n/2} \prod_{i=1}^{n} \sigma_{1i}\right]^{-1} \exp\left[-\tfrac{1}{2} \sum_{i=1}^{n} (x_{1i}/\sigma_{1i})^2\right].$

Section 1 is devoted to probability content problems, special cases
of (4) with the region $C_1$ being $\sum_{i=1}^{n} (x_{1i} - b_i)^2 \le R^2$. Thus the point B
is destroyed if the point of impact is within R units of the fixed point.
If all $\sigma_{1i}^2 = \sigma^2$, then $P(\cdot)$ is the integral of a non-central chi-square
density function with n degrees of freedom and non-centrality parameter
$\sum_{i=1}^{n} b_i^2/\sigma^2$. Very extensive tables exist for n = 2, adequate tables for
n = 3(1)30(2)50(5)100. Results are less abundant if the variances are not
equal. However, for B = 0, n = 2, 3 and B $\ne$ 0, n = 2, existing tables
seem to be quite adequate.
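Where tables are not at hand, the equal-variance probability content can be computed directly. The sketch below evaluates the non-central chi-square distribution function as a Poisson-weighted sum of central chi-square distribution functions, using the closed form available when the degrees of freedom are even. The function names are illustrative, the restriction to even n is a simplification made here, and the fixed number of series terms is adequate only for moderate non-centrality.

```python
import math

def ncx2_cdf_even(x, df, nc, terms=80):
    """Non-central chi-square c.d.f. for even df, as a Poisson(nc/2) mixture."""
    if df <= 0 or df % 2:
        raise ValueError("this sketch handles even, positive df only")
    if x <= 0.0:
        return 0.0
    half_x, half_nc = x / 2.0, nc / 2.0
    # partial = sum_{i < df/2} half_x**i / i!, so that the central c.d.f. with
    # df degrees of freedom is 1 - exp(-half_x) * partial.
    term, partial = 1.0, 0.0
    for i in range(df // 2):
        partial += term
        term *= half_x / (i + 1)
    weight = math.exp(-half_nc)  # Poisson weight for j = 0
    total = 0.0
    for j in range(terms):
        total += weight * (1.0 - math.exp(-half_x) * partial)
        partial += term                       # move to df + 2(j + 1)
        term *= half_x / (df // 2 + j + 1)
        weight *= half_nc / (j + 1)
    return total

def probability_content(R, sigma, b):
    """P(impact within R of fixed point b), equal variances sigma**2, even len(b)."""
    n = len(b)
    return ncx2_cdf_even((R / sigma) ** 2, n, sum(bi * bi for bi in b) / sigma ** 2)

print(probability_content(1.0, 1.0, [0.0, 0.0]))
print(probability_content(1.0, 1.0, [1.0, 0.0]))
```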


Section 2 describes some special cases of (5). The most interesting
results are obtained by using (7) with equal variances for the density
of $X_1$ and $\sum_{i=1}^{n} (x_{1i} - x_{2i})^2 \le R^2$ for $C_1$. Thus, if $X_1$ is within R units
of $X_2$, $X_2$ is destroyed. For these cases the probability can be
expressed as the integral

(8)  $P(\cdot) = \int_{-\infty}^{\infty} H\left(\frac{R^2}{\sigma^2};\ n,\ \frac{r^2}{\sigma^2}\right) dG(X_2) = \int_0^{\infty} H\left(\frac{R^2}{\sigma^2};\ n,\ \frac{r^2}{\sigma^2}\right) dQ\left(\frac{r}{\sigma}\right)$

where $H$ is the non-central chi-square distribution function
with n degrees of freedom and non-centrality parameter $\sum_{i=1}^{n} x_{2i}^2/\sigma^2 = r^2/\sigma^2$,
and $Q(r/\sigma)$ is the distribution function of $r/\sigma$ (which is, of course, determined
by $G(X_2)$). The evaluation of the integral (8) is discussed for the cases:

I. The distribution of $X_2$ gives equal weight to each point on
$\sum_{i=1}^{n} x_{2i}^2 = D^2$, no weight elsewhere. That is, $X_2$ is uniformly
distributed over the surface of a sphere of radius D centered at the
origin.

II. $X_2$ is uniformly distributed within or on a sphere of radius D
centered at the origin. Thus,

$g(X_2) = \frac{1}{V(D)}$,  $\sum_{i=1}^{n} x_{2i}^2 \le D^2$
       $= 0$,  elsewhere

where V(D) is the volume of the sphere.

III. $X_2$ has a density $g(X_2)$ taking on the form (in spherical coordinates)

$g(r, \theta_1, \ldots, \theta_{n-1}) = \left[2D\pi^{n-1}\right]^{-1}$,  $0 \le r \le D$,
                                  $0 \le \theta_i \le \pi$,  $i = 1, \ldots, n-2$,
                                  $0 \le \theta_{n-1} \le 2\pi$
                     $= 0$,  elsewhere

so that the spherical coordinates are each independently and uniformly
distributed.

IV. $r/\sigma$ has a gamma distribution.

V. $r^2/\sigma^2$ has a gamma distribution.

VI. $r/\sigma$ has a beta distribution.


Finally, a case not falling under (8) in which $X_1$ and $X_2$ both have density
(7) (but with different variances) is discussed. Perhaps II is the most
interesting since it generalizes a well known result by Germond [3]. For
this case

(9)  $P(\cdot) = P(R, D) = H\left(\frac{R^2}{\sigma^2};\ n+2,\ \frac{D^2}{\sigma^2}\right) + \left(\frac{R}{D}\right)^n H\left(\frac{D^2}{\sigma^2};\ n,\ \frac{R^2}{\sigma^2}\right)$

and evaluation is accomplished by using tables of the non-central chi-
square distribution [4].
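Formula (9) can be compared with a direct numerical average of the non-central chi-square distribution function over the target distribution. The sketch below does this for n = 2, where $r^2$ is uniformly distributed on $[0, D^2]$ for a target uniform within the circle. The reading of (9) used here is the reconstruction given above; the function names, the restriction to even n, and the quadrature settings are assumptions of this sketch.

```python
import math

def ncx2_cdf_even(x, df, nc, terms=80):
    """Non-central chi-square c.d.f. for even df, as a Poisson(nc/2) mixture."""
    if x <= 0.0:
        return 0.0
    half_x, half_nc = x / 2.0, nc / 2.0
    term, partial = 1.0, 0.0
    for i in range(df // 2):
        partial += term
        term *= half_x / (i + 1)
    weight = math.exp(-half_nc)
    total = 0.0
    for j in range(terms):
        total += weight * (1.0 - math.exp(-half_x) * partial)
        partial += term
        term *= half_x / (df // 2 + j + 1)
        weight *= half_nc / (j + 1)
    return total

def formula_9(R, D, n, sigma=1.0):
    """P(R, D) from formula (9); n must be even in this sketch."""
    r2, d2 = (R / sigma) ** 2, (D / sigma) ** 2
    return ncx2_cdf_even(r2, n + 2, d2) + (R / D) ** n * ncx2_cdf_even(d2, n, r2)

def direct_average_2d(R, D, sigma=1.0, steps=400):
    """Midpoint-rule average of H(R^2; 2, r^2) with r^2 uniform on [0, D^2]."""
    r2, d2 = (R / sigma) ** 2, (D / sigma) ** 2
    total = 0.0
    for i in range(steps):
        total += ncx2_cdf_even(r2, 2, (i + 0.5) / steps * d2)
    return total / steps

print(formula_9(1.0, 1.0, 2), direct_average_2d(1.0, 1.0))
print(formula_9(1.0, 2.0, 2), direct_average_2d(1.0, 2.0))
```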


In Section 3 a few models with damage function

$$P_1(X_1, X_2) = \exp\left[-\sum_{i=1}^{n} (x_{2i} - x_{1i})^2 / 2\lambda^2\right]$$

are discussed. Again $X_1$ is assumed to have density (7). Then $P(\cdot)$
is evaluated for

I. Same as Case I of Section 2.

II. Same as Case II of Section 2 except that unequal variances are
permitted in (7).

III. Same as Case III of Section 2.

IV. Same as Case V of Section 2.

V. Both $X_1$ and $X_2$ have density (7) but with different variances.

EXTENDING THE DEFINITION TO THE MULTIPLE SHOT CASE.
Again, having a special problem in mind will help in constructing the
definition. Let us consider the following case discussed by Jarnagin
and DiDonato [5]. A big bomb is aimed at a point target located at the
origin of a two-dimensional coordinate system. When the weapon arrives
at the target, the latter is located at $X_2$, a randomly selected position
within or on a circle of radius D. Assume that aiming errors for the big
bomb are circularly normally distributed with unit variance. That is, when
the big bomb detonates its position is governed by the density

$$f_3(x_{31}, x_{32}) = \frac{1}{2\pi} \exp\left[-\tfrac{1}{2}\left(x_{31}^2 + x_{32}^2\right)\right].$$

At detonation the big bomb scatters N bomblets, each with lethal radius R,
with impact points uniformly and independently distributed over a circle
of radius A. Thus, the density of $X_1$, the impact point of a bomblet,
is for given $X_3$

$f_{13}(x_{11}, x_{12} \mid x_{31}, x_{32}) = \frac{1}{\pi A^2}$,  $(x_{11} - x_{31})^2 + (x_{12} - x_{32})^2 \le A^2$
                                        $= 0$,  otherwise.

Now, given that the target is at $X_2$ and the big bomb detonates at $X_3$,
$X_2$ is captured by a bomblet if $X_1$ is within a distance R of $X_2$ (see Figure
3). The probability that this happens is

$$P_S = \iint_{C_1} f_{13}(x_{11}, x_{12} \mid x_{31}, x_{32})\, dx_{11}\, dx_{12}$$

where $C_1$ is the region $(x_{11} - x_{21})^2 + (x_{12} - x_{22})^2 \le R^2$. The target will be
captured if it is covered by at least one bomblet. This happens with
probability $1 - (1 - P_S)^N$ because of the independence condition. The
probability that the target will be captured regardless of where the big bomb
detonates is

$$h(X_2) = \int_{-\infty}^{\infty} \left[1 - (1 - P_S)^N\right] f_3(X_3)\, dX_3.$$

Fig. 3. Big bomb detonates at $X_3$, bomblet at $X_1$.
Target is at $X_2$.

Finally, the probability that the target will be captured no matter where
it is located is

$$\int_{C_2} h(X_2)\, g(X_2)\, dX_2$$

where $C_2$ is the region $x_{21}^2 + x_{22}^2 \le D^2$ and

$g(X_2) = \frac{1}{\pi D^2}$,  $x_{21}^2 + x_{22}^2 \le D^2$
       $= 0$,  otherwise.

This problem will be discussed further in a later section.
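The big bomb example lends itself to direct simulation. The sketch below draws the target uniformly within the circle of radius D, the detonation point from the circular normal density with unit variance, and N bomblet impact points uniformly within the circle of radius A about the detonation point, and counts the trials in which at least one bomblet lands within R of the target. Function names, parameter values, trial count and seed are illustrative assumptions.

```python
import math
import random

def uniform_in_disk(rng, cx, cy, radius):
    """Point uniformly distributed within the circle of given radius about (cx, cy)."""
    rho = radius * math.sqrt(rng.random())
    theta = 2.0 * math.pi * rng.random()
    return cx + rho * math.cos(theta), cy + rho * math.sin(theta)

def estimate_cluster_kill(R, A, D, N, trials=50_000, seed=0):
    """Monte Carlo estimate of the probability that at least one bomblet captures the target."""
    rng = random.Random(seed)
    kills = 0
    for _trial in range(trials):
        tx, ty = uniform_in_disk(rng, 0.0, 0.0, D)          # target position X2
        bx, by = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)   # detonation point X3
        for _bomblet in range(N):
            x, y = uniform_in_disk(rng, bx, by, A)          # bomblet impact X1
            if (x - tx) ** 2 + (y - ty) ** 2 <= R * R:
                kills += 1
                break
    return kills / trials

print(estimate_cluster_kill(R=0.5, A=2.0, D=1.0, N=1))
print(estimate_cluster_kill(R=0.5, A=2.0, D=1.0, N=20))
```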

To generalize the above result let $X_3$ = the impact point of the big
bomb, $F_3(X_3)$ = the distribution function of $X_3$, $X_1$ = impact point of
a bomblet, $F_{13}(X_1 \mid X_3)$ = conditional distribution of $X_1$ given $X_3$, the
same for each of the N bomblets with all N impact points being independently
distributed, $X_2$ = position of target when the bomblets impact,
$G(X_2)$ = distribution function of the point target, $P_1(X_1, X_2)$ = probability
of destroying the target for given values of $X_1$ and $X_2$, $P_S$ = probability
of capturing the target for any one bomblet given $X_2$ and $X_3$. Then

$$P_S = \int_{-\infty}^{\infty} P_1(X_1, X_2)\, dF_{13}(X_1 \mid X_3)$$

and

(10)  $P(\cdot) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \left[1 - (1 - P_S)^N\right] dF_3(X_3)\, dG(X_2)$

is the probability of destroying the target. Expanding the binomial under
the integral in (10) leads to the alternate form

(11)  $P(\cdot) = \sum_{k=1}^{N} (-1)^{k+1} \binom{N}{k} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} P_S^k\, dF_3(X_3)\, dG(X_2).$

We will define an n-dimensional coverage problem as the evaluation of a
probability of the type given by (10) or (11).
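The step from (10) to (11) rests on the binomial identity $1 - (1 - p)^N = \sum_{k=1}^{N} (-1)^{k+1} \binom{N}{k} p^k$ applied to the integrand. A quick numerical check of that identity is sketched below; the function names are illustrative.

```python
from math import comb

def at_least_one(p, N):
    """Probability that at least one of N independent shots succeeds."""
    return 1.0 - (1.0 - p) ** N

def alternating_sum(p, N):
    """The same probability written as the expanded binomial sum used in (11)."""
    return sum((-1) ** (k + 1) * comb(N, k) * p ** k for k in range(1, N + 1))

print(at_least_one(0.3, 10), alternating_sum(0.3, 10))
```

For large N the alternating sum involves heavy cancellation in floating point, so the direct form is usually the safer one to evaluate numerically.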

If $X_3$ has density

(12)  $f_3(X_3) = 1$,  $X_3 = B$ (a fixed point)
              $= 0$,  otherwise

then (10) reduces to

(13)  $P(\cdot) = \int_{-\infty}^{\infty} \left[1 - (1 - P_S)^N\right] dG(X_2)$

where $X_3 = B$ in $P_S$. Formula (13) yields $P(\cdot)$ for N shots aimed
independently at B (at the origin if B = 0). Further if N = 1, (13) becomes

$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} P_1(X_1, X_2)\, dF(X_1)\, dG(X_2),$$

the single shot formula (where $F(X_1) = F_{13}(X_1 \mid B)$).

SOME SPECIAL CASES OF FORMULA (13).

Big Bomb Hits Origin with Probability 1, Zero-One Damage Function

Assume that aiming errors of the big bomb are governed by the p.d.f.
of (12) with B = 0 and that $X_2$ is uniformly distributed over a sphere of
radius D centered at the origin, that is, has p.d.f.

(14)  $g(X_2) = [V(D)]^{-1}$,  $\sum_{i=1}^{n} x_{2i}^2 \le D^2$ (region $C_2$)
             $= 0$,  otherwise

where V(D) is the volume of a sphere of radius D. We will also assume
that the density of $X_1$ given $X_3$ is

(15)  $f_{13}(X_1 \mid X_3) = \left[(2\pi)^{n/2} \prod_{i=1}^{n} \sigma_{1i}\right]^{-1} \exp\left[-\tfrac{1}{2} \sum_{i=1}^{n} (x_{1i} - x_{3i})^2/\sigma_{1i}^2\right]$

with $\sigma_{1i} = \sigma$, i = 1, 2, ..., n and where $x_{3i} = 0$, i = 1, 2, ..., n because the
big bomb hits the origin with probability 1. Then

$$P_S = \int_{C_1} dF_{13}(X_1 \mid 0)$$

where $C_1$ is the region $\sum_{i=1}^{n} (x_{1i} - x_{2i})^2 \le R^2$. It is well known that this
integral has the value

(16)  $P_S = H\left(\frac{R^2}{\sigma^2};\ n,\ \frac{r^2}{\sigma^2}\right)$

where $r^2 = \sum_{i=1}^{n} x_{2i}^2$. Hence

(17)  $P(\cdot) = \sum_{k=1}^{N} (-1)^{k+1} \binom{N}{k} \frac{1}{V(D)} \int_{C_2} \left[H\left(\frac{R^2}{\sigma^2};\ n,\ \frac{r^2}{\sigma^2}\right)\right]^k dX_2 = \sum_{k=1}^{N} (-1)^{k+1} \binom{N}{k} \frac{n}{D^n} \int_0^D \left[H\left(\frac{R^2}{\sigma^2};\ n,\ \frac{r^2}{\sigma^2}\right)\right]^k r^{n-1}\, dr.$

The multiple integral converts to a single integral by virtue of the result
on page 248 of [1]. We know from Formula (9) that the single integral in
(17) can be expressed in terms of H functions for k = 1. A corresponding
result for k $\ge$ 2 may be possible but it is unknown at the present time.

For the case n = 2, Jarnagin [6] has prepared tables of (17) for $R/\sigma$
= .005(.005).05(.01).10(.02).20(.05)1(.1)2(.2)4(.5)10, $D/\sigma$ = .05, .1(.1)
4(.5)12, N = 1(1)20. Also included is an inverse table giving the number
of bomblets N required to make $P(\cdot)$ = .05(.05).95 for the range of
$D/\sigma$ given above and with $R/\sigma$ ranging over values required to make N
go from 1 to 999.
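For n = 2 the probability described here can also be evaluated without tables. The sketch below combines (13), (14) and (16): it averages $1 - (1 - P_S)^N$ over the target distribution, using the fact that $r^2$ is uniform on $[0, D^2]$ for a target uniform within a circle. It is an independent numerical sketch, not a reproduction of Jarnagin's tables; the function names and quadrature settings are assumptions.

```python
import math

def ncx2_cdf_even(x, df, nc, terms=80):
    """Non-central chi-square c.d.f. for even df, as a Poisson(nc/2) mixture."""
    if x <= 0.0:
        return 0.0
    half_x, half_nc = x / 2.0, nc / 2.0
    term, partial = 1.0, 0.0
    for i in range(df // 2):
        partial += term
        term *= half_x / (i + 1)
    weight = math.exp(-half_nc)
    total = 0.0
    for j in range(terms):
        total += weight * (1.0 - math.exp(-half_x) * partial)
        partial += term
        term *= half_x / (df // 2 + j + 1)
        weight *= half_nc / (j + 1)
    return total

def multi_shot_p(R, D, N, sigma=1.0, steps=400):
    """P(.) for N shots aimed at the origin, n = 2, zero-one damage function."""
    r2, d2 = (R / sigma) ** 2, (D / sigma) ** 2
    total = 0.0
    for i in range(steps):
        lam = (i + 0.5) / steps * d2           # midpoint of a slice of r^2 / sigma^2
        p_s = ncx2_cdf_even(r2, 2, lam)        # single bomblet probability, formula (16)
        total += 1.0 - (1.0 - p_s) ** N
    return total / steps

for N in (1, 5, 20):
    print(N, multi_shot_p(1.0, 1.0, N))
```

An inverse lookup of the kind mentioned above could be obtained by increasing N until `multi_shot_p` first reaches the desired probability.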
Big Bomb Hits at Point B with Probability 1, Exponential Damage Function

Assume that the damage function is

(18)  $P_1(X_1, X_2) = \exp\left(-\tfrac{1}{2} \sum_{i=1}^{n} (x_{2i} - x_{1i})^2/\lambda^2\right)$

and that the p.d.f. of $X_1$ given $X_3 = B$ is given by (15) with $x_{3i} = b_i$,
i = 1, 2, ..., n. Then an easy integration yields

$$P_S = \frac{\lambda^n}{\prod_{i=1}^{n} (\sigma_{1i}^2 + \lambda^2)^{1/2}} \exp\left(-\tfrac{1}{2} \sum_{i=1}^{n} \frac{(x_{2i} - b_i)^2}{\sigma_{1i}^2 + \lambda^2}\right).$$

Expanding the binomial in (13) we can write

(19)  $P(\cdot) = \sum_{k=1}^{N} (-1)^{k+1} \binom{N}{k} \int_{-\infty}^{\infty} \frac{\lambda^{nk}}{\prod_{i=1}^{n} (\sigma_{1i}^2 + \lambda^2)^{k/2}} \exp\left[-\frac{k}{2} \sum_{i=1}^{n} \frac{(x_{2i} - b_i)^2}{\sigma_{1i}^2 + \lambda^2}\right] dG(X_2).$
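The closed form for $P_S$ under the exponential damage function can be checked against direct numerical integration in one dimension, where $P_S$ is the integral of the damage function (18) against a normal density with mean b and standard deviation $\sigma$. The sketch below does this; the function names, test values and quadrature settings are illustrative assumptions.

```python
import math

def p_s_closed_form(x2, b, sigmas, lam):
    """Closed form for P_S; x2, b and sigmas are per-coordinate sequences."""
    prefactor, exponent = 1.0, 0.0
    for x, bi, s in zip(x2, b, sigmas):
        v = s * s + lam * lam
        prefactor *= lam / math.sqrt(v)
        exponent += (x - bi) ** 2 / v
    return prefactor * math.exp(-0.5 * exponent)

def p_s_quadrature_1d(x2, b, sigma, lam, steps=20_000, width=10.0):
    """Midpoint-rule integral of damage(x1) * normal density over b -/+ width * sigma."""
    lower = b - width * sigma
    h = 2.0 * width * sigma / steps
    total = 0.0
    for i in range(steps):
        x1 = lower + (i + 0.5) * h
        damage = math.exp(-0.5 * (x2 - x1) ** 2 / lam ** 2)
        density = math.exp(-0.5 * ((x1 - b) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        total += damage * density * h
    return total

print(p_s_closed_form([0.7], [0.2], [1.5], 0.8))
print(p_s_quadrature_1d(0.7, 0.2, 1.5, 0.8))
```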


First assume that $X_2$ is uniformly distributed over an ellipsoid
whose center is at the origin and whose axes are parallel to the coordinate
axes. Then

$g(X_2) = [V(C_2)]^{-1}$,  $\sum_{i=1}^{n} (x_{2i}/a_i)^2 \le 1$ (the region $C_2$)
       $= 0$,  otherwise

where $V(C_2)$ is the volume of $C_2$. Then if we let $\sqrt{k}\,(x_{2i} - b_i)/(\sigma_{1i}^2 + \lambda^2)^{1/2}
= y_i$, the probability (19) becomes

(20)  $P(\cdot) = \sum_{k=1}^{N} (-1)^{k+1} \binom{N}{k} \frac{\lambda^{nk} (2\pi)^{n/2}}{V(C_2)\, k^{n/2} \prod_{i=1}^{n} (\sigma_{1i}^2 + \lambda^2)^{(k-1)/2}}\, J_k$

where

$$J_k = \int_{C_{2k}} f_0(Y)\, dY,$$

$f_0(Y)$ is the standard normal density in n dimensions, and $C_{2k}$ is the
region

$$\sum_{i=1}^{n} \frac{\left[y_i + \sqrt{k}\, b_i/(\sigma_{1i}^2 + \lambda^2)^{1/2}\right]^2}{k a_i^2/(\sigma_{1i}^2 + \lambda^2)} \le 1.$$

Tables from which $J_k$ can be obtained when n = 2 have been prepared
by Germond [7], DiDonato and Jarnagin [8], Lowe [9], and Rosenthal
and Rodden [10]. If $b_1 = b_2 = 0$ so that the ellipse is centered at the
origin, then $J_k$ can be evaluated from the tables published by Esperti [11],
Harter [12], DiDonato and Jarnagin [13], and Marsaglia [14]. All the
above tables are described by Guenther and Terragno [1]. Groves [15]
derived (20) for the case n = 2 and includes a 16 page table of $J_k$ for this
case (with $b_1 = b_2 = 0$) in his report.

If all $\sigma_{1i}^2 = \sigma^2$, and $a_i = D$, then

$$J_k = H\left[\frac{k D^2}{\sigma^2 + \lambda^2};\ n,\ \frac{k r^2}{\sigma^2 + \lambda^2}\right]$$

where $r^2 = \sum_{i=1}^{n} b_i^2$;
further if B = 0, then $J_k$ reduces to a central chi-square probability.
For both the latter two cases many tables are available and a description
of these tables is found in Section 1 of [1].


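If in (19) we take $B = 0$, $\sigma_{1i} = \sigma$ and assume that $G(X_2)$ gives equal
weight to each point on the sphere $\sum_{i=1}^{n} x_{2i}^2 = D^2$, then (19) reduces to

$$
P(\cdot) = \sum_{k=1}^{N} (-1)^{k+1} \binom{N}{k}
\frac{\lambda^{nk}}{(\sigma^2 + \lambda^2)^{nk/2}}
\exp\!\left[ -\frac{k D^2}{2(\sigma^2 + \lambda^2)} \right]
\tag{21}
$$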

since everything comes out in front of the multiple integral except $dG(X_2)$,
which when integrated over the whole space yields 1. For a $G(X_2)$ so
chosen, $X_2$ picks its position at random on the surface of the sphere. The
answer is the same, of course, no matter how $G(X_2)$ assigns probability
on the surface of the sphere, but uniform assignment is the most realistic
model.
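As a numerical illustration of (21), the following minimal Python sketch evaluates the alternating sum directly. It also compares the result with $1 - (1 - p)^N$, which follows from the binomial theorem because the single shot probability $p = [\lambda^2/(\sigma^2+\lambda^2)]^{n/2} \exp[-D^2/2(\sigma^2+\lambda^2)]$ is constant on the sphere. The function names and the sample values are ours and purely illustrative.

```python
from math import comb, exp


def coverage_prob_sphere(N, n, sigma, lam, D):
    """Alternating binomial sum of formula (21)."""
    s = sigma**2 + lam**2
    total = 0.0
    for k in range(1, N + 1):
        term = comb(N, k) * (lam**2 / s) ** (n * k / 2) * exp(-k * D**2 / (2 * s))
        total += (-1) ** (k + 1) * term
    return total


def coverage_prob_closed(N, n, sigma, lam, D):
    """Same probability written as 1 - (1 - p)^N (binomial theorem)."""
    s = sigma**2 + lam**2
    p = (lam**2 / s) ** (n / 2) * exp(-D**2 / (2 * s))
    return 1.0 - (1.0 - p) ** N


# Illustrative values only: 5 shots in the plane.
print(coverage_prob_sphere(5, 2, 1.0, 1.5, 2.0))
print(coverage_prob_closed(5, 2, 1.0, 1.5, 2.0))
```

For large $N$ the alternating sum loses accuracy through cancellation, so the second form is the safer one to compute with.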

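As one further model let us assume that $B = 0$ and $X_2$ has p.d.f.

$$
g(X_2) = \left[ (2\pi)^{n/2} \prod_{i=1}^{n} \sigma_{2i} \right]^{-1}
\exp\!\left[ -\tfrac{1}{2} \sum_{i=1}^{n} (x_{2i}/\sigma_{2i})^2 \right].
\tag{22}
$$

Then (19) readily reduces to

$$
P(\cdot) = \sum_{k=1}^{N} (-1)^{k+1} \binom{N}{k}
\frac{\lambda^{nk}}
{\prod_{i=1}^{n} \left[ (\sigma_{1i}^2 + \lambda^2)^{(k-1)/2}
(k\sigma_{2i}^2 + \sigma_{1i}^2 + \lambda^2)^{1/2} \right]} .
$$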
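The product formula for a normally distributed target can be checked numerically in one dimension. The sketch below evaluates the closed form and compares it with a midpoint-rule integration of $1 - (1 - p_s)^N$ against the target density. It assumes that with $B = 0$ the single shot probability at target position $x$ is $p_s(x) = [\lambda^2/(\sigma_1^2+\lambda^2)]^{1/2} \exp[-x^2/2(\sigma_1^2+\lambda^2)]$, in line with the exponential damage function. Function names, step counts and sample values are ours.

```python
from math import comb, exp, pi, sqrt


def coverage_prob_normal_target(N, sigma1, sigma2, lam):
    """Closed form for B = 0 and target density (22); sigma1, sigma2 are per-axis lists."""
    n = len(sigma1)
    total = 0.0
    for k in range(1, N + 1):
        denom = 1.0
        for s1, s2 in zip(sigma1, sigma2):
            s = s1**2 + lam**2
            denom *= s ** ((k - 1) / 2) * sqrt(k * s2**2 + s)
        total += (-1) ** (k + 1) * comb(N, k) * lam ** (n * k) / denom
    return total


def coverage_prob_quadrature_1d(N, s1, s2, lam, steps=20000, width=10.0):
    """Midpoint-rule check for n = 1: E[1 - (1 - p_s(x))^N] with x ~ N(0, s2^2)."""
    s = s1**2 + lam**2
    h = 2 * width * s2 / steps
    acc = 0.0
    for j in range(steps):
        x = -width * s2 + (j + 0.5) * h
        p_s = sqrt(lam**2 / s) * exp(-x**2 / (2 * s))  # assumed single shot form
        density = exp(-x**2 / (2 * s2**2)) / (sqrt(2 * pi) * s2)
        acc += (1 - (1 - p_s) ** N) * density * h
    return acc


# Illustrative values only.
print(coverage_prob_normal_target(4, [1.0], [2.0], 1.5))
print(coverage_prob_quadrature_1d(4, 1.0, 2.0, 1.5))
```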


SOME SPECIAL CASES OF FORMULA (10).

The Jarnagin-DiDonato Model

Let us return to the example which we used to introduce multiple
shot coverage problems but generalise the discussion to n dimensions.
Then $X_1$ given $X_3$ is uniformly distributed over a sphere of radius $A$
centered at $X_3$ so that

$$
f_{13}(X_1 \mid X_3) =
\begin{cases}
[V(A)]^{-1}, & \sum_{i=1}^{n} (x_{1i} - x_{3i})^2 \le A^2 \quad \text{(region } C_3\text{)} \\
0, & \text{otherwise,}
\end{cases}
$$

$X_2$ is uniformly distributed over a sphere of radius $D$ centered at the
origin so that it has the p.d.f. given by (14), and

$$
f_3(X_3) = \left[ (2\pi)^{n/2} \prod_{i=1}^{n} \sigma_{3i} \right]^{-1}
\exp\!\left[ -\tfrac{1}{2} \sum_{i=1}^{n} (x_{3i}/\sigma_{3i})^2 \right].
\tag{23}
$$

Here $V(A)$ is the volume of a sphere of radius $A$. We will assume that
$\sigma_{3i} = \sigma$, $i = 1, 2, \ldots, n$, and for convenience (as DiDonato and Jarnagin have
done) we will take $\sigma = 1$, which means all distances are expressed in
standard units. The damage function is

$$
p_1(X_1, X_2) = 1, \quad \sum_{i=1}^{n} (x_{1i} - x_{2i})^2 \le R^2 \quad \text{(region } C_4\text{)}
$$

and zero otherwise. Then

$$
p_s = \int_{C_3 \cap C_4} [V(A)]^{-1}\, dX_1 = \frac{V(t^2)}{V(A)}
$$

where $t^2 = \sum_{i=1}^{n} (x_{2i} - x_{3i})^2$ and $V(t^2)$ is the volume common to $C_3$ and
$C_4$. Hence, since all functions appearing in (10) are known, the 2n-fold
integral could be written down with the integrand expressed in terms of
$X_2$ and $X_3$.

Some simplification is possible. We seek $E[u(t^2)]$ where
$u(t^2) = 1 - (1 - p_s)^N$. If the density of $t^2$ were known, then $P(\cdot)$ could be
expressed as a single integral with integrand in $t^2$. We know from working with
single shot coverage problems that the density of $t^2$ given $r^2 = \sum_{i=1}^{n} x_{2i}^2$
is non-central chi-square with non-centrality parameter $r$. This is

$$
h(t^2; n, r^2) = \tfrac{1}{2} \left( \frac{t}{r} \right)^{(n-2)/2}
\exp\!\left[ -\tfrac{1}{2}(t^2 + r^2) \right] I_{(n-2)/2}(tr)
\tag{24}
$$

where $I_{(n-2)/2}(x)$ is the modified Bessel function of order $(n-2)/2$. The
density function of $r^2$ (see [1], p. 248 for the density of $r$) is

$$
q(r^2) =
\begin{cases}
\dfrac{n (r^2)^{(n-2)/2}}{2 D^n}, & 0 \le r^2 \le D^2 \\
0, & \text{otherwise.}
\end{cases}
$$

Hence the joint distribution of $t^2$ and $r^2$ is $h(t^2; n, r^2)\, q(r^2)$ and

$$
P(\cdot) = \int_0^{(A+R)^2} \int_0^{D^2} u(t^2)\, h(t^2; n, r^2)\, q(r^2)\, dr^2\, dt^2 ,
\tag{25}
$$

a double integral.
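The two densities entering (25) are easy to evaluate without tables. The following minimal sketch writes the non-central chi-square density (24) as a Poisson mixture of central chi-square densities, which is a standard equivalent of the Bessel function form, and also codes $q(r^2)$. The function names and the truncation at 200 terms are our own choices; for $n = 2$ the density is symmetric in its two arguments, which the example prints.

```python
from math import exp, lgamma, log


def chi2_pdf(x, df):
    """Central chi-square density with df degrees of freedom (x > 0)."""
    half = df / 2.0
    return exp((half - 1.0) * log(x) - x / 2.0 - half * log(2.0) - lgamma(half))


def noncentral_chi2_pdf(x, n, nc, terms=200):
    """Density (24) at t^2 = x with n degrees of freedom and non-centrality nc = r^2,
    computed as a Poisson(nc / 2) mixture of central chi-square densities."""
    if nc == 0.0:
        return chi2_pdf(x, n)
    total = 0.0
    for j in range(terms):
        log_weight = -nc / 2.0 + j * log(nc / 2.0) - lgamma(j + 1.0)
        total += exp(log_weight) * chi2_pdf(x, n + 2 * j)
    return total


def q_density(r2, n, D):
    """Density of r^2 for a point uniformly distributed in an n-sphere of radius D."""
    if 0.0 <= r2 <= D**2:
        return n * r2 ** ((n - 2) / 2.0) / (2.0 * D**n)
    return 0.0


# For n = 2 the two values below agree, since (24) is then symmetric.
print(noncentral_chi2_pdf(1.3, 2, 2.7), noncentral_chi2_pdf(2.7, 2, 1.3))
```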

For the 2-dimensional case a further simplification is possible
since (24) is then symmetric in $t^2$ and $r^2$, and $q(r^2) = 1/D^2$. Thus, in (25)
the integration over $r^2$ yields $H(D^2; 2, t^2)$ so that

$$
P(\cdot) = \frac{1}{D^2} \int_0^{(A+R)^2} u(t^2)\, H(D^2; 2, t^2)\, dt^2 .
\tag{26}
$$

The Jarnagin and DiDonato report includes over 100 pages of graphs
which yield the $P(\cdot)$ of (26). Two cases are considered. For Case I,
$R < A$ and $20 \le N \le 500$ for various values of $D$, $A$, and $\pi R^2$. For Case
II, $R > A$ and $1 \le N \le 20$ for selected values of $R$, $D$, $A$. The Case I
graphs give $\pi D^2 P(\cdot)$ while the set for Case II give $P(\cdot)$ directly.
Various approximations to $P(\cdot)$ are discussed.

From a practical point of view the most interesting case is $R < A$.
For this situation it is immediately apparent that bounds on the $P(\cdot)$ of
(26) are


[1  -  (1  -  K)N1 


H(D2;2,t2)  dt2  <  P(-) 


“g"  H(D2;  2,t2)  dt2  . 


Both  integrals  appearing  in  (27)  can  be  expressed  in  terms  of  H  functions 
by  using  (9).  The  H  functions  in  turn  can  be  evaluated  by  using  the 
tables  of  Hayman,  Govindarajulu,  and  Leone  [4]  .  Of  course,  the  smaller 
the  R  the  closer  the  bounds  will  be. 


EXPONENTIAL DAMAGE FUNCTION, DETONATION POINTS OF BIG
AND LITTLE BOMBS NORMALLY DISTRIBUTED. Assume that the
damage function is given by (18), the density of $X_1$ given $X_3$ by (15),
and the density of $X_3$ by (23). Then a straightforward evaluation yields

$$
p_s = \int_{-\infty}^{\infty} p_1(X_1, X_2)\, f_{13}(X_1 \mid X_3)\, dX_1
= \prod_{i=1}^{n} \frac{\lambda}{(\sigma_{1i}^2 + \lambda^2)^{1/2}}
\exp\!\left[ -\tfrac{1}{2} \sum_{i=1}^{n} (x_{3i} - x_{2i})^2/(\sigma_{1i}^2 + \lambda^2) \right].
$$

The same kind of evaluation next gives

$$
\int_{-\infty}^{\infty} p_s^k\, f_3(X_3)\, dX_3
= \frac{\lambda^{kn} \exp\!\left[ -\tfrac{k}{2} \sum_{i=1}^{n}
x_{2i}^2/(k\sigma_{3i}^2 + \sigma_{1i}^2 + \lambda^2) \right]}
{\prod_{i=1}^{n} \left[ (\sigma_{1i}^2 + \lambda^2)^{(k-1)}
(k\sigma_{3i}^2 + \sigma_{1i}^2 + \lambda^2) \right]^{1/2}} .
\tag{28}
$$

To write down $P(\cdot)$ as given by (10) we need finally to integrate (28) over
the range of $X_2$.

For several distributions of $X_2$, $P(\cdot)$ is obtained very quickly. We
will consider

Case I: $\sigma_{1i} = \sigma_1$, $\sigma_{3i} = \sigma_3$ and $G(X_2)$ gives equal weight to each point
on the sphere $\sum_{i=1}^{n} x_{2i}^2 = D^2$. Then with the same reasoning used
to obtain (21) we get

$$
P(\cdot) = \sum_{k=1}^{N} (-1)^{k+1} \binom{N}{k}
\frac{\lambda^{kn} \exp\!\left[ -k D^2 / 2(k\sigma_3^2 + \sigma_1^2 + \lambda^2) \right]}
{\left[ (\sigma_1^2 + \lambda^2)^{k-1} (k\sigma_3^2 + \sigma_1^2 + \lambda^2) \right]^{n/2}} .
$$
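The Case I formula can be coded in a few lines. The sketch below is a minimal illustration with names and sample values of our own choosing. One check is printed: when $\sigma_3 = 0$ the aiming point is fixed, each term of the sum becomes $p^k$ with $p = [\lambda^2/(\sigma_1^2+\lambda^2)]^{n/2} \exp[-D^2/2(\sigma_1^2+\lambda^2)]$, and the result must agree with $1 - (1-p)^N$, the value obtained from (21).

```python
from math import comb, exp


def coverage_prob_case_one(N, n, sigma1, sigma3, lam, D):
    """Case I: common sigma_1 and sigma_3, target on the sphere of radius D."""
    s = sigma1**2 + lam**2
    total = 0.0
    for k in range(1, N + 1):
        c = k * sigma3**2 + s
        term = lam ** (k * n) * exp(-k * D**2 / (2 * c)) / (s ** (k - 1) * c) ** (n / 2)
        total += (-1) ** (k + 1) * comb(N, k) * term
    return total


# Illustrative values only: with sigma_3 = 0 the two numbers agree.
s = 1.0**2 + 1.5**2
p = (1.5**2 / s) ** (2 / 2) * exp(-2.0**2 / (2 * s))
print(coverage_prob_case_one(6, 2, 1.0, 0.0, 1.5, 2.0), 1 - (1 - p) ** 6)
print(coverage_prob_case_one(6, 2, 1.0, 0.8, 1.5, 2.0))
```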


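Case II: The density of $X_2$ is given by (14). Letting

$$
y_i = \frac{k^{1/2} x_{2i}}{(k\sigma_{3i}^2 + \sigma_{1i}^2 + \lambda^2)^{1/2}}
$$

and recalling that

$$
V(D) = \frac{\pi^{n/2} D^n}{\Gamma\!\left( \frac{n+2}{2} \right)}
$$

we get

$$
P(\cdot) = \sum_{k=1}^{N} (-1)^{k+1} \binom{N}{k}
\frac{\lambda^{kn}\, \Gamma\!\left( \frac{n+2}{2} \right) 2^{n/2}}
{D^n k^{n/2} \prod_{i=1}^{n} (\sigma_{1i}^2 + \lambda^2)^{(k-1)/2}}
\int \frac{1}{(2\pi)^{n/2}} \exp\!\left[ -\tfrac{1}{2} \sum_{i=1}^{n} y_i^2 \right] dy_1 \cdots dy_n
$$

where the integral is taken over the region
$\sum_{i=1}^{n} (k\sigma_{3i}^2 + \sigma_{1i}^2 + \lambda^2)\, y_i^2 / k \le D^2$. The
evaluation of standard normal integrals over ellipsoidal and
spherical regions is discussed in Section 1.3 of [1].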

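Case III: The density of $X_2$ is given by (22). A routine integration yields

$$
P(\cdot) = \sum_{k=1}^{N} (-1)^{k+1} \binom{N}{k}
\frac{\lambda^{kn}}
{\prod_{i=1}^{n} \left[ (\sigma_{1i}^2 + \lambda^2)^{(k-1)}
(k\sigma_{2i}^2 + k\sigma_{3i}^2 + \sigma_{1i}^2 + \lambda^2) \right]^{1/2}} .
$$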


CONCLUDING REMARKS. Although the definition of a coverage
problem which we have given can be further generalized, many of the
interesting models which have received attention are special cases of
the definition as we have given it. Certainly there are models which
may be of interest other than those covered in the Guenther-Terragno
review and in this paper.

In this review we have considered only the zero-one damage function
and the exponential damage function given by (18). Many others have been
proposed. For example, another possibility that has some merit is


PjfXj.Xj)  =  1, 


2  (Va)*  S  »2 

1=1 


^  ’  2  ^  fil  ^xli“x2i^  •  R  3  A  J  ■  j[l^xii'x2i 


)2>r 
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The  damage  function  (32)  is  found  in  [1]  but  the  topic  is  not  pursued.  Other 
damage  functions  are  mentioned  in  [16]  and  [17]  . 

The  first  step  for  a  potential  researcher  in  the  field  of  coverage 
problems  is  to  select  a  useful  and  realistic  model.  Having  made  that 
choice,  the  remainder  of  the  task  confronting  an  investigator  is  mainly 
numerical.  It  is  possible  that  most  or  all  of  the  computation  required 
is  already  available  in  the  literature  if  one  knows  where  to  look.  Even  if 
no  such  results  are  in  existence,  chances  are  excellent  that  probabilities 
of  interest  can  be  evaluated  if  one  is  clever  enough  in  handling  special 
functions  and  computers. 

Work on target coverage problems has suffered from a mass duplication
of effort. This is in part due to (a) some company publications being
difficult if not impossible to obtain, (b) results having been published not
only in obscure publications but also in many different journals so that
it is difficult to keep current in the field, and (c) some papers being
difficult to read unless one has background in both probability and target
coverage.

1. INTRODUCTION. The statistical literature is abundant with results
concerning the design and analysis of factorial experiments. Most of these
results relate to designed experiments whose intricate balance usually
provides orthogonal contrasts for the estimation of parameter functions
for which inferences are desired. The consequences of such designs are
statistical efficiency of estimation with exactness of estimation theory
and simplicity of computational procedures thrown in as 'fringe benefits'.

Unfortunately, however, in basic and operational research there are
many situations where the scientist is forced to draw inferences from
data which have not arisen from carefully balanced factorial experiments,
mainly because part of the origin of his data is beyond his control. Thus
we may be concerned with an analysis of operational data in a chemical
plant attempting to relate the quality and yield of the output to various
types and sources of input materials, to different types of catalysts, to
various modes of operating the plant such as temperature and pressure
levels and running times. Even if it is possible to control the change in
the various input factors it will often not be possible to conduct balanced
experiments. Again in genetical research concerned with heritability
studies we may study certain traits of the progeny resulting from the
mating of a number of sires each to a different set of dams. We may
try to arrange for the 'breeding pens' of the progeny trial to have an
equal number of dams in each but the progeny resulting from each mating
is beyond the control of the experimenter, resulting in an 'unequal
number nested classification' of data. Again, in medical research we
may wish to compare the follow-up of patients who have received different
treatments. Such follow-up data are often classified with regard to
numerous concomitant characteristics concerning the medical history,
environmental and genetical background of patients, resulting in data
arranged in completely unbalanced factorial patterns. There is clearly
no possibility of a designed experiment here.


*This paper gives only a summary of some of the results derived in
more detail by Hartley, H. O. and Rao, J. N. K., "Maximum Likelihood
Estimation for the Mixed Analysis of Variance Model," submitted for
publication in Biometrika.

We do not need to add further examples of this kind; indeed it is
generally recognized that they will outnumber, by far, the situations of
data from balanced experiments.

In the case of balanced designs the estimation problem for the constants
and variances involved in the linear model theory of the experimental
data has been extensively treated. Confining ourselves to just one reference
on variance estimation, optimality properties of the classical analysis of
variance procedures have already been demonstrated for various balanced
designs (see e.g., Graybill (1961)). However, results for unbalanced
factorial and nested data are much more restricted. Henderson (1953)
has suggested a method of unbiassed estimation of variance components
for the unbalanced two-way classification, but his method is computationally
cumbersome for a mixed model and when the number of classes is large.
Searle and Henderson (1961) have suggested a simpler method, also for the
unbalanced two-way classification, with one fixed factor containing a
moderate number of levels and a random factor permitted to have quite
a large number of levels. Bush and Anderson (1963) have investigated,
for the two-way classification random model, the relative efficiency of
Henderson's (1953) method and two other methods, A and B, based on the
respective methods of fitting constants and weighted squares of means
described by Yates (1934) for experiments based on a fixed effects model,
which also provide unbiassed estimates of variance components. Possibilities
of generalizations are indicated. In all the above methods the
estimates of any constants in the model are computed from the 'Aitken
Type' weighted least squares estimators based on the exact variance-covariance
matrix of the experimental responses, which involves the
unknown variance ratios. The estimation of the latter is then based on
various unbiassed procedures so that little is known about any optimality
properties of any of the resulting estimators. However, all these methods
reduce to the well known procedures based on minimal sufficient statistics
in the special cases of balanced designs.

The method of maximum likelihood estimation here developed differs
from the above in that maximum likelihood equations are used and solved
for both the estimates of constants and variances. This method has
apparently not been used by the above authors (and is indeed 'rejected'
by Bush and Anderson, 1963) because the computational effort is not (in
their view) warranted by the known properties of maximum likelihood
estimation. This point is well taken. However, we have nevertheless
undertaken to develop this theory on the following grounds:

(a) Within reason and with the help of suitable numerical techniques
the argument of computational labor loses its stigma with the
progress in computer technology.

(b) Our technique of maximum likelihood estimation provides a
numerical analysis for the completely general mixed model
and does not require the development of new devices whenever
a more involved situation of unbalanced factorial data arises.
Moreover, it provides the basis for a completely general
'analysis of variance test' procedure in the form of
'likelihood-ratio tests'.

(c) We have established large sample optimality properties and
it is already apparent that for small experiments the amount of
computational labor is quite comparable with that involved in
alternatives. Here our technique will permit Monte Carlo
evaluations of small sample variances (on the lines made by
Bush and Anderson) for the maximum likelihood estimators.
For really large experiments (such as arise with certain
genetical problems) the large sample optimality properties of
maximum likelihood estimators should provide a clear justification
of additional computer time (if any).

(d) Recent research in identifying minimal sufficient statistics
for the estimation of the parameters (see e.g., Hultquist and
Graybill (1965), Furukawa (1960)) is at this time confined to
several special designs. Since a universal method of identifying
such statistics when they exist is not available, it is a considerable
(small sample) advantage of maximum likelihood estimators
that they will automatically be functions of such statistics whenever
they exist.

(e) Our estimates of variance components are always $\ge 0$ (see section
4) and whilst the alternative estimators could be modified to
also be $\ge 0$ they would thereby lose the property of unbiassedness
which is the main justification of their use.

2. SPECIFICATION OF THE GENERAL MIXED MODEL. The specification
of the general mixed model will be sufficiently general to cover
most of the situations of unbalanced factorial data arising in practice.
On the other hand, it utilises certain specific features which distinguish
analysis of variance models from a completely general linear model
involving both 'constants' as well as random variables.

The linear model here treated is given by

$$
y = X\alpha + U_1 b_1 + \cdots + U_c b_c + e
\tag{1}
$$

where

$X$ is an $n \times k$ matrix of known fixed numbers,
$U_i$ is an $n \times m_i$ matrix of known fixed numbers,
$\alpha$ is a $k \times 1$ vector of unknown constants,
$b_i$ is an $m_i \times 1$ vector of independent variables from $N(0, \sigma_i^2)$,
$e$ is an $n \times 1$ vector of independent variables from $N(0, \sigma^2)$.

The random vectors $b_1, b_2, \ldots, b_c$, and $e$ are mutually independent
and $y$ is given by (1).

We assume that the design matrices $X$ and $U_i$ are all of full rank,
i.e., the rank of $X$ is $k$ and the rank of $U_i$ is $m_i$. In terms of analysis
of variance terminology the vector of constants $\alpha$ comprises in its
elements all levels of all fixed factors, i.e., the levels of all fixed
main effects and interactions appropriately re-parameterised so that
the design matrix $X$ has full rank. For the $c$ random factors we are
keeping the components separate since all elements of $b_i$ have the same
unknown variance $\sigma_i^2$. Usually (with analysis of variance models)
each $y$ is associated with precisely one level of the $i$th random factor
so that the design matrix $U_i$ will have in each row precisely one 1 and
the remaining $m_i - 1$ elements zero. We therefore assume that the $U_i$
have this property, which implies that all $m_i \times m_i$ matrices $U_i' U_i$ are
diagonal.

One additional important assumption must be made about the design
matrices which may be described as a condition for estimability of
$\alpha$ and the $\sigma_i^2$. Denote by

$$
m = \sum_{i=1}^{c} m_i
\tag{2}
$$

the total number of levels in all random components. Then the adjoined
$n \times (k+m)$ matrix

$$
M = (X \mid U_1 \mid \cdots \mid U_c)
\tag{3}
$$

is assumed to have as a base an $n \times r$ matrix $W$ of the form

$$
W = (X \mid U^*)
\tag{4}
$$

where the $n \times (r-k)$ matrix $U^*$ must contain at least one column from
each $U_i$ so that

$$
k + c \le r \le k + m .
\tag{5}
$$
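The structural assumptions of this section can be illustrated on a small one-way layout. The sketch below is only an illustration under our own naming: `indicator_matrix`, `rank` and `gram_is_diagonal` are hypothetical helpers, and the six observations in three groups are made-up data. It builds $U_1$ with exactly one 1 per row, confirms that $U_1'U_1$ is diagonal, and computes the rank $r$ of $M = (X \mid U_1)$ to compare with (5).

```python
def indicator_matrix(levels):
    """n x m_i design matrix with exactly one 1 in each row."""
    names = sorted(set(levels))
    return [[1.0 if lev == name else 0.0 for name in names] for lev in levels]


def rank(rows, tol=1e-10):
    """Rank by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in rows]
    r = 0
    for col in range(len(a[0])):
        pivot = max(range(r, len(a)), key=lambda i: abs(a[i][col]), default=None)
        if pivot is None or abs(a[pivot][col]) < tol:
            continue
        a[r], a[pivot] = a[pivot], a[r]
        for i in range(r + 1, len(a)):
            f = a[i][col] / a[r][col]
            a[i] = [x - f * y for x, y in zip(a[i], a[r])]
        r += 1
    return r


def gram_is_diagonal(U):
    """True when every off-diagonal element of U'U is zero."""
    m = len(U[0])
    return all(
        sum(row[p] * row[q] for row in U) == 0.0
        for p in range(m) for q in range(m) if p != q
    )


# Made-up one-way layout: general mean as the only constant, one random factor.
levels = ["a", "a", "b", "b", "c", "c"]
U1 = indicator_matrix(levels)
X = [[1.0] for _ in levels]
M = [x + u for x, u in zip(X, U1)]   # adjoined matrix (3)
k, c, m = 1, 1, len(U1[0])
r = rank(M)
print(r, k + c <= r <= k + m, gram_is_diagonal(U1))
```

Here the column of ones is the sum of the columns of $U_1$, so $r = 3$ lies strictly between $k + c = 2$ and $k + m = 4$.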

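3. THE LIKELIHOOD EQUATIONS. From (1) it is obvious that $y$
follows a multivariate normal distribution with variance-covariance
matrix

$$
\sigma^2 H = \sigma^2 \{ I_n + \gamma_1 U_1 U_1' + \cdots + \gamma_c U_c U_c' \}
\tag{6}
$$

where

$$
\gamma_i = \sigma_i^2 / \sigma^2 .
\tag{7}
$$

Hence the likelihood of $y$ is given by

$$
L = (2\pi)^{-n/2} \sigma^{-n} |H|^{-1/2}
\exp\{ -(y - X\alpha)' H^{-1} (y - X\alpha) / 2\sigma^2 \} .
\tag{8}
$$

The differentiation of the log likelihood

$$
\lambda = \log L
\tag{9}
$$

with regard to $\alpha$, $\sigma^2$ and $\gamma_i$ yields the equations

$$
\frac{\partial \lambda}{\partial \alpha}
= \sigma^{-2} \{ X' H^{-1} y - (X' H^{-1} X)\alpha \} = 0
\tag{10}
$$

$$
\frac{\partial \lambda}{\partial \sigma^2}
= -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4} (y - X\alpha)' H^{-1} (y - X\alpha) = 0
\tag{11}
$$

$$
\frac{\partial \lambda}{\partial \gamma_i}
= -\tfrac{1}{2} \operatorname{tr}\!\left\{ H^{-1} \frac{\partial H}{\partial \gamma_i} \right\}
- \frac{1}{2\sigma^2} (y - X\alpha)' \frac{\partial H^{-1}}{\partial \gamma_i} (y - X\alpha)
$$
$$
\phantom{\frac{\partial \lambda}{\partial \gamma_i}}
= -\tfrac{1}{2} \operatorname{tr}( H^{-1} U_i U_i' )
+ \frac{1}{2\sigma^2} (y - X\alpha)' H^{-1} U_i U_i' H^{-1} (y - X\alpha) .
\tag{12}
$$

Whilst it has long been recognised that equations (10) and (11) readily
yield the maximum likelihood estimates $\tilde\alpha$ and $\tilde\sigma^2$ as functions of the $\gamma_i$
involved in $H$, the solution of equations (12), i.e., $\partial\lambda/\partial\gamma_i = 0$, has not been
attempted in the past. We give in the next section a numerical procedure
for solving the simultaneous equations (10), (11), and $\partial\lambda/\partial\gamma_i = 0$ given
by (12).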

4. SOLUTION OF THE MAXIMUM LIKELIHOOD EQUATIONS BY
STEEPEST ASCENT. As mentioned in 3. the equations (10) and (11)
are readily solved for $\alpha$ and $\sigma^2$ in terms of the $\gamma_i$. We obtain the
familiar answers for 'weighted least squares'

$$
\tilde\alpha = (X' H^{-1} X)^{-1} (X' H^{-1} y)
\tag{13}
$$

$$
n \tilde\sigma^2 = y' H^{-1} y - (X' H^{-1} y)' (X' H^{-1} X)^{-1} (X' H^{-1} y) .
\tag{14}
$$

Equations (13) and (14) yield $\tilde\alpha$ and $\tilde\sigma^2$ in terms of the $y$ and $\gamma_i$. We
require symbols for this functional relationship and write in place of
(13) and (14)

$$
\tilde\alpha = \tilde\alpha(\gamma_i)
\tag{15}
$$

and

$$
\tilde\sigma^2 = \tilde\sigma^2(\gamma_i) .
\tag{16}
$$


Substitution of (15) and (16) in (12) and equating to zero would yield
$c$ simultaneous equations for the $c$ values of $\gamma_i$. The solutions of these
equations are now obtained as the asymptotic limits of a system of $c$
simultaneous differential equations, namely the equations of steepest
ascent given by

$$
\frac{d\gamma_i}{dt} = f_i\big( \tilde\alpha(\gamma_i),\, \tilde\sigma^2(\gamma_i),\, \gamma_i \big)
\tag{17}
$$

where the $k + 1 + c$ argument function $f_i(\alpha, \sigma^2, \gamma_i)$ is given by the right
hand side of (12) and (15) and (16) are substituted for $\alpha$ and $\sigma^2$.

The variable of integration, $t$, in (17) is auxiliary and the numerical
integration of (17) commences at initial trial values ${}_0\gamma_i$ (usually chosen
as consistent estimators) so that

$$
\gamma_i = {}_0\gamma_i \quad \text{at } t = 0 .
\tag{18}
$$

It can now be shown that as $t \to \infty$

$$
\lim_{t \to \infty} \gamma_i(t) = \hat\gamma_i \quad \text{(say)}
\tag{19}
$$

and

$$
\lim_{t \to \infty} f_i\big( \tilde\alpha(\gamma_i),\, \tilde\sigma^2(\gamma_i),\, \gamma_i \big) = 0 .
\tag{20}
$$

Therefore, $\hat\gamma_i$ together with $\tilde\alpha(\hat\gamma_i)$, $\tilde\sigma^2(\hat\gamma_i)$ represent a solution of the maximum
likelihood equations (10), (11), and $\partial\lambda/\partial\gamma_i = 0$ given by (12). It should be noted
that although the limit along a specific path of integration is unique as
$t \to \infty$ it does not follow that there is only one solution of the maximum
likelihood equations, since a change in the starting point ${}_0\gamma_i$ may give
rise to a different path of integration.
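The steepest ascent equations (17) can be tried out on a very small example. The following is a minimal sketch under several assumptions of our own: the data are a made-up balanced one-way layout (three groups of three observations, general mean as the only constant, $c = 1$), the integration uses the classical fourth-order Runge-Kutta rule with a fixed step, and $H$ is inverted directly by Gauss-Jordan elimination. All function names are hypothetical. For this layout the maximum likelihood solution is known in closed form, $\hat\gamma = 3.5/3$ and $\tilde\sigma^2 = 1$, which gives something to compare the limit with.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def inverse(A):
    """Gauss-Jordan inverse with partial pivoting (small dense matrices)."""
    n = len(A)
    aug = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda i: abs(aug[i][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        d = aug[col][col]
        aug[col] = [v / d for v in aug[col]]
        for i in range(n):
            if i != col:
                f = aug[i][col]
                aug[i] = [v - f * w for v, w in zip(aug[i], aug[col])]
    return [row[n:] for row in aug]


def profile_gradient(gammas, y, X, Us):
    """Right hand side of (12) with (13) and (14) substituted; one value per gamma_i."""
    n = len(y)
    H = [[float(i == j) for j in range(n)] for i in range(n)]
    for g, U in zip(gammas, Us):
        UUt = matmul(U, transpose(U))
        H = [[h + g * u for h, u in zip(hr, ur)] for hr, ur in zip(H, UUt)]
    Hinv = inverse(H)
    ycol = [[v] for v in y]
    XtHinv = matmul(transpose(X), Hinv)
    alpha = matmul(inverse(matmul(XtHinv, X)), matmul(XtHinv, ycol))      # (13)
    resid = [[a[0] - b[0]] for a, b in zip(ycol, matmul(X, alpha))]
    w = matmul(Hinv, resid)                                               # H^{-1}(y - X alpha)
    sigma2 = sum(r[0] * v[0] for r, v in zip(resid, w)) / n               # (14)
    grads = []
    for U in Us:
        Ut = transpose(U)
        trace = sum(matmul(matmul([u], Hinv), [[v] for v in u])[0][0] for u in Ut)
        quad = sum(v[0] ** 2 for v in matmul(Ut, w))
        grads.append(-0.5 * trace + quad / (2.0 * sigma2))
    return grads, alpha, sigma2


def steepest_ascent(gamma0, y, X, Us, step=0.25, n_steps=200):
    """Fixed-step classical Runge-Kutta integration of (17) from the trial values gamma0."""
    g = list(gamma0)
    f = lambda v: profile_gradient(v, y, X, Us)[0]
    for _ in range(n_steps):
        k1 = f(g)
        k2 = f([a + 0.5 * step * b for a, b in zip(g, k1)])
        k3 = f([a + 0.5 * step * b for a, b in zip(g, k2)])
        k4 = f([a + step * b for a, b in zip(g, k3)])
        g = [a + step * (p + 2 * q + 2 * r + s) / 6.0
             for a, p, q, r, s in zip(g, k1, k2, k3, k4)]
    return g


# Made-up balanced data: group means -1.5, 0, 1.5 and within-group spread -1, 0, 1.
y = [-2.5, -1.5, -0.5, -1.0, 0.0, 1.0, 0.5, 1.5, 2.5]
X = [[1.0] for _ in y]
U = [[1.0 if i // 3 == j else 0.0 for j in range(3)] for i in range(9)]
gamma_hat = steepest_ascent([0.5], y, X, [U])
grads, alpha_hat, sigma2_hat = profile_gradient(gamma_hat, y, X, [U])
print(gamma_hat[0], alpha_hat[0][0], sigma2_hat, grads[0])
```

The step length and number of steps were chosen by hand for this example; a different starting value or a badly scaled problem may need a smaller step.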


Finally we should comment on a modification of our steepest ascent
integration which ensures that $\gamma_i \ge 0$ along the path. First observe that
the log likelihood is a differentiable function of $\tau_i = \gamma_i^{1/2}$ which is
symmetrical at $\tau_i = 0$. It follows that if $\tau_i$ is used as a parameter in
place of $\gamma_i$ we have

$$
\frac{\partial \lambda}{\partial \tau_i} = \frac{\partial \lambda}{\partial \gamma_i} \cdot 2\tau_i .
\tag{21}
$$

Therefore, the steepest ascent differential equations (17) can be replaced
by

$$
\frac{d\tau_i}{dt} = 2\tau_i\, f_i\big( \tilde\alpha(\gamma_i),\, \tilde\sigma^2(\gamma_i),\, \gamma_i \big) .
\tag{22}
$$

The integration would commence at positive values ${}_0\gamma_i$ but should the
path of integration reach a point where one or several of the $\tau_i = 0$, a
new integration would be started at that point and the one or several $\tau_i$
would be held at $\tau_i = 0$ for the rest of the integration path. The limit as
$t \to \infty$ will again be a solution of the likelihood equations

$$
\frac{\partial \lambda}{\partial \tau_i} = 0 .
\tag{23}
$$

This procedure ignores and avoids any possible solutions of the likelihood
equations with $\gamma_i < 0$.

It would carry us too far afield if we were to discuss in this paper
computational details of solving the system of $c$ ordinary first order
differential equations (17) or (22). It suffices to state that a large step
(high order) Runge-Kutta procedure (see e.g., Henrici (1962)) is found
to be quite serviceable. For large $n$ (i.e., $n > 50$) numerical inversion
of the $n \times n$ matrix $H$ involved in (12), (13), and (14) can be completely
avoided by reducing this task to operations involving only matrix inversions
of order $m \times m$, where $m = \sum m_i$, on lines similar to Henderson
et al (1959). The relevant equation is

$$
H^{-1} = I - Z (Z'Z + I)^{-1} Z'
\tag{24}
$$

where $Z$ is the adjoined $n \times m$ matrix

$$
Z = ( \sqrt{\gamma_1}\, U_1 \mid \cdots \mid \sqrt{\gamma_c}\, U_c ) .
\tag{25}
$$

With the help of (24) the computational work is quite manageable on high-speed
computers and a program is in preparation covering data for which
n < 500, c < 5, k < 150, m < 150. The computer time on the IBM 7094
is estimated to range between 5 minutes and 2 hours, largely depending
on the magnitudes of m and k.
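Identity (24) is easily verified numerically. The sketch below is an illustration only: it uses a made-up layout with $n = 6$ and two random factors ($m_1 = 3$, $m_2 = 2$), arbitrary values of $\gamma_1$ and $\gamma_2$, and hypothetical helper names. It forms $Z$ as in (25), inverts only the $m \times m$ matrix $Z'Z + I$, and compares the result with a direct inversion of $H$ from (6).

```python
from math import sqrt


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def inverse(A):
    """Gauss-Jordan inverse with partial pivoting (small dense matrices)."""
    n = len(A)
    aug = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda i: abs(aug[i][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        d = aug[col][col]
        aug[col] = [v / d for v in aug[col]]
        for i in range(n):
            if i != col:
                f = aug[i][col]
                aug[i] = [v - f * w for v, w in zip(aug[i], aug[col])]
    return [row[n:] for row in aug]


def h_inverse_small(gammas, Us):
    """H^{-1} from (24): only the m x m matrix Z'Z + I is inverted."""
    n = len(Us[0])
    Z = [sum(([sqrt(g) * v for v in U[i]] for g, U in zip(gammas, Us)), [])
         for i in range(n)]                      # adjoined matrix (25)
    Zt = transpose(Z)
    core = matmul(Zt, Z)
    for j in range(len(core)):
        core[j][j] += 1.0
    corr = matmul(matmul(Z, inverse(core)), Zt)
    return [[float(i == j) - corr[i][j] for j in range(n)] for i in range(n)]


def h_inverse_direct(gammas, Us):
    """H^{-1} by inverting the n x n matrix H of (6) directly."""
    n = len(Us[0])
    H = [[float(i == j) for j in range(n)] for i in range(n)]
    for g, U in zip(gammas, Us):
        for i in range(n):
            for j in range(n):
                H[i][j] += g * sum(a * b for a, b in zip(U[i], U[j]))
    return inverse(H)


# Made-up crossed layout: three levels of factor 1, two levels of factor 2.
U1 = [[1.0 if i // 2 == j else 0.0 for j in range(3)] for i in range(6)]
U2 = [[1.0 if i % 2 == j else 0.0 for j in range(2)] for i in range(6)]
A = h_inverse_small([0.7, 1.3], [U1, U2])
B = h_inverse_direct([0.7, 1.3], [U1, U2])
print(max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb)))
```

The saving is in the order of the matrix inverted: $m = 5$ here against $n = 6$, but $m$ stays fixed as further observations are added to the same levels.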
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